ABSTRACT

Given two locations s and t in a road network, a distance query
returns the minimum network distance from s to t, while a shortest 
path query computes the actual route that achieves the minimum 
distance. These two types of queries find important applications in 
practice, and a plethora of solutions have been proposed in the past few
decades. The existing solutions, however, are optimized for either 
practical or asymptotic performance, but not both. In particular, the 
techniques with enhanced practical efficiency are mostly heuristic- 
based, and they offer unattractive worst-case guarantees in terms of 
space and time. On the other hand, the methods that are worst-case 
efficient often entail prohibitive preprocessing or space overheads, 
which render them inapplicable for the large road networks (with 
millions of nodes) commonly used in modern map applications.

This paper presents Arterial Hierarchy (AH), an index structure 
that narrows the gap between theory and practice in answering 
shortest path and distance queries on road networks. On the theoretical
side, we show that, under a realistic assumption, AH answers
any distance query in O(log α) time, where α = d_max/d_min, and
d_max (resp. d_min) is the largest (resp. smallest) L∞ distance between
any two nodes in the road network. In addition, any shortest
path query can be answered in O(k + log α) time, where k is the
number of nodes on the shortest path. On the practical side, we
experimentally evaluate AH on a large set of real road networks 
with up to twenty million nodes, and we demonstrate that (i) AH 
outperforms the state of the art in terms of query time, and (ii) its 
space and pre-computation overheads are moderate. 

1. INTRODUCTION 

Given two locations s and t in a road network, a distance query
returns the network distance from s to t, while a shortest path 
query computes the actual route that achieves the minimum dis- 
tance. These two types of queries find important applications in 
map, navigation, and location-based services. To illustrate, con- 
sider that a user of a map service is looking for a nearby Italian 
restaurant for dinner. In response to the user's query, the service 
provider can first retrieve the list of Italian restaurants in the re- 
gion close to the user's current location u. After that, the network 
distance from u to each restaurant is computed (using a distance
query), and those distances are returned to the user along with the 
list of restaurants. Then, if the user chooses a preferred restaurant r 
from the list, the service provider can employ a shortest path query 
to provide the user with driving directions from u to r.

The classic solution for shortest path and distance queries is
Dijkstra's algorithm [9]. It traverses the road network nodes in
ascending order of their distances from s; once it reaches t during
the traversal, it can compute the distance from s to t and can re- 
trieve the shortest path based on the information recorded before 
t is visited. With proper data structures, Dijkstra's algorithm runs 
in O(n log n + m) time for any shortest path or distance query,
where n (resp. m) is the number of nodes (resp. edges) in the road 
network. Albeit simple and elegant, Dijkstra's algorithm is ineffi- 
cient for sizable road networks, as it requires traversing all network 
nodes that are closer to s than t, which incurs a significant overhead 
when s and t are far apart.
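The traversal described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the adjacency-list format, the function name `dijkstra_distance`, and the toy graph are assumptions made here. It uses a binary heap from Python's standard library, which yields O((n + m) log n) time rather than the O(n log n + m) bound quoted above (that bound requires a more elaborate priority queue).

```python
import heapq

def dijkstra_distance(graph, s, t):
    """Return the network distance from s to t, or infinity if t is unreachable.

    graph maps each node to a list of (neighbor, edge_length) pairs
    (a hypothetical input format chosen for this sketch).
    """
    dist = {s: 0.0}
    heap = [(0.0, s)]              # entries are (tentative distance, node)
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue               # stale queue entry, already processed
        settled.add(u)
        if u == t:
            return d               # nodes are settled in ascending order of distance
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

# Toy network with three nodes (illustrative data only).
toy = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0)],
    "c": [("a", 5.0), ("b", 1.0)],
}
print(dijkstra_distance(toy, "a", "c"))  # 3.0
```

Every node closer to s than t is settled before t, which is the overhead that the index structures discussed below aim to avoid.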

A plethora of techniques [4, 6, 8, 10, 24] have been proposed to
improve over Dijkstra's algorithm in terms of either practical effi- 
ciency or asymptotic bounds. The existing methods that focus on 
practical performance mostly rely on heuristics, and hence, their 
asymptotic bounds are unattractive in general. For instance, the 
best heuristic approach by Geisberger et al. [11] answers shortest
path or distance queries in at most a few milliseconds even on road
networks with millions of nodes, but its space and time complexities
are both O(n^2), i.e., its asymptotic performance is even worse
than that of Dijkstra's algorithm. On the other hand, the solutions 
that offer favorable query time complexities often entail prohibitive 
preprocessing cost or space overhead, rendering them only applica- 
ble for small datasets. For example, the state-of-the-art approaches 
by Samet et al. [21] and Abraham et al. [4] provide superior bounds
on query time, but they require pre-computing the shortest path be- 
tween any pair of nodes, which is impractical for the large road 
networks commonly used in modern map applications.

Contributions. This paper presents Arterial Hierarchy (AH), an 
index structure that narrows the gap between theory and practice
in answering shortest path and distance queries on road networks.
On the theoretical side, we show that, under a realistic assumption,
AH answers any distance query in O(log α) time, where
α = d_max/d_min, and d_max (resp. d_min) is the largest (resp.
smallest) L∞ distance between any two nodes in the road network. In
addition, any shortest path query can be answered in O(k + log α)
time, where k is the number of nodes on the shortest path. On the
practical side, we experimentally evaluate AH on a large set of real 
road networks with up to twenty million nodes, and we demonstrate 
that (i) AH outperforms the state of the art in terms of query time, 
and (ii) its space and pre-computation overheads are moderate. 
Figure 1: Road network G (with bidirectional edges).
Figure 2: Node hierarchy H.



In a nutshell, AH organizes the nodes in the road network into a 
hierarchy, based on which it pre-computes auxiliary information to 
facilitate query processing. For instance, given the road network G 
in Figure 1, AH constructs a three-level hierarchy H (illustrated in
Figure 2), where each level consists of a disjoint subset of the nodes
in G. Note that H contains all edges in G, as well as two auxiliary
edges, (v9, v10) and (v10, v11), each of which has a length that
equals the distance between the two nodes that it connects. These
auxiliary edges are referred to as shortcuts, and they can be exploited
to considerably reduce the numbers of nodes and edges that need 
to be traversed during a shortest path or distance query. 

For example, given a distance query between v1 and v10 (in G),
AH would perform two alternating traversals (in H) starting from
v1 and v10, respectively, and it would always avoid traveling from
a higher-level node to a lower-level node. In particular, the traversal
starting from v1 can only reach two nodes, v10 and v11, since
(i) v11 is the only node adjacent to v1, and (ii) from v11, AH would
only traverse to v10 (since v10 is the only neighbor of v11 that is not
at a lower level than v11). Similarly, the traversal starting from v10
would only reach v11. Once the two traversals terminate, the distance
between v1 and v10 is calculated by summing up the weights
of (v1, v11) and (v11, v10).

In general, AH answers any distance query with two traversals 
of the node hierarchy, such that each traversal only moves up from 
low-level nodes to high-level nodes, but not vice versa. We show 
that, for real road networks, the node hierarchy contains O(log α)
levels, where α = d_max/d_min. Furthermore, each traversal performed
by AH visits only a constant number of nodes and edges
in any level of the hierarchy. As a consequence, the total number
of nodes and edges visited by AH is O(log α), which results in an
O(log α) time complexity for any distance query. In addition, once
the distance between two nodes s and t is computed, AH can derive
the actual shortest path from s to t in O(k) time, where k is
the number of nodes on the shortest path. 

The aforementioned time complexities of AH rely on an assumption
on road networks (to be clarified in Section 2). We provide
detailed discussion on the assumption, and we demonstrate its ap- 
plicability on practical road networks with extensive experiments 
on a large collection of real datasets. These experimental findings 
not only form a basis for our theoretical claims but also shed light 
on the characteristics of real road networks, which paves the way
for future research on shortest path and distance queries. 



2. PROBLEM AND ASSUMPTIONS 

Let G be a road network. We assume that G is a directed,
degree-bounded, and connected graph with a node set V and an edge set
E, such that (i) |V| = n, (ii) each node in V is located in a
two-dimensional space, and (iii) each edge e ∈ E is associated with a
positive weight l(e). Without loss of generality, we consider that
l(e) equals the length of e. For any path P in G, we define its
length l(P) as the total length of the edges in P.

We study two types of queries on G, namely, shortest path 
queries and distance queries. Given an ordered pair of nodes 
(s, t) ∈ V × V, a shortest path query asks for a sequence of edges
e_1, e_2, ..., e_k that form a path from s to t, such that
Σ_{i=1}^{k} l(e_i) is minimized. On the other hand, a distance query
from s to t asks only for the value of Σ_{i=1}^{k} l(e_i) instead of
the actual shortest path. For convenience, we define the distance
from s to t as dist(s, t) = Σ_{i=1}^{k} l(e_i).

Our solution for shortest path and distance queries is developed 
based on an observation on the properties of real road networks, as 
explained in the following. 

Observation. Assume that we impose a square grid R on G. Let 
B be a region containing 4 × 4 cells in the grid. We define the
left-most (resp. right-most) column of cells in B as the west strip 
(resp. east strip) of B, and we refer to the vertical line that evenly 
divides B as the vertical bisector of B. We also define B's north 
strip, south strip, and horizontal bisector in a similar manner. For 
example, Figure 4 illustrates (i) a square grid imposed on the road
network in Figure 1, (ii) a region B covering 4 × 4 grid cells, and
(iii) the strips and bisectors of B. 
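To make the terminology concrete, the following sketch locates the strips and bisectors of a (4×4)-cell region. It is an illustration under assumptions introduced here: the region is given by its lower-left corner (x0, y0) and a cell side length `cell`, north is taken as the positive y direction, and the function names are hypothetical.

```python
def strips_of(point, x0, y0, cell):
    """Return the set of strips of the 4x4-cell region that contain the point."""
    x, y = point
    col = int((x - x0) // cell)    # column index, 0 (west) to 3 (east)
    row = int((y - y0) // cell)    # row index, 0 (south) to 3 (north)
    if not (0 <= col < 4 and 0 <= row < 4):
        return set()               # the point lies outside the region
    strips = set()
    if col == 0:
        strips.add("west")
    if col == 3:
        strips.add("east")
    if row == 0:
        strips.add("south")
    if row == 3:
        strips.add("north")
    return strips

def bisectors(x0, y0, cell):
    """Return the positions of the vertical and horizontal bisectors."""
    return {"vertical_x": x0 + 2 * cell, "horizontal_y": y0 + 2 * cell}

print(strips_of((0.5, 3.5), 0.0, 0.0, 1.0))   # a corner cell lies in two strips
print(bisectors(0.0, 0.0, 1.0))
```

Note that each corner cell belongs to two strips at once, and the four central cells belong to none.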

We observe that, in practice, the shortest paths between the west 
and east strips of B can often be covered by a small set S_we of road
network edges intersecting B's vertical bisector. That is, given any 
two points in B's west and east strips, respectively, the shortest 
path between the two points should pass through at least one edge 
in S_we. For instance, suppose that B covers the area of a state. In
that case, any shortest path P between the west and east strips of B 
corresponds to a route that connects the west and east ends of the 
state. Intuitively, P would have to pass through some major intra-state
highways. Therefore, if S_we contains the road network edges
on the intra-state highways that intersect B's vertical bisector, then
S_we should cover any aforementioned shortest path P. Furthermore,
the cardinality of S_we should be small, as there should exist
only a handful of major highways in the state that go across the 
vertical bisector. Similar statements can be made even when B 
corresponds to a larger region (e.g., a continent) or a smaller one 
(e.g., a city). In addition, we also observe that all shortest paths 
between the north and south strips of B can be covered by a few 
edges intersecting B's horizontal bisector. 

The above observations are similar in spirit to those made in previous
work [4, 5, 23], which all illustrate that there exists a small
set of important road network edges or nodes that cover all shortest
paths connecting distant regions (see Section 5 for a survey of
related work). In what follows, we will formalize our observations
and provide empirical evidence, so as to form a basis for further
discussions in Sections 3 and 4.

Formalization. Given a region B of 4 × 4 grid cells, we say that a
road network path P is a local path in B, if at most one edge in P
intersects the boundary of B. For instance, in Figure 4, the paths
(v9, v5, v8) and (v11, v7, v4) are both local paths in B. A local
path in B is the shortest, if it is shorter than any other local path in
B with the same endpoints. For simplicity, we assume that no two
local paths in B share the same endpoints and have the same length.
This assumption can be enforced by adding a small perturbation
to each edge in G, as shown in Appendix A.

We are interested in the local shortest paths between opposite 
strips of B, and a set of edges on B's bisectors that cover all such 
paths, as defined in the following. 



Definition 1 (Spanning Paths & Arterial Edges). 
A local shortest path P in B is a spanning path of B, if (i) the two
endpoints of P are on different sides of a bisector of B (denoted
as l_b), and (ii) neither of the endpoints is contained in a grid cell
adjacent to the bisector l_b. Any edge on P that intersects l_b is an
arterial edge of B.

By Definition 1, the path P = (v9, v6, v10, v8) in Figure 4 is a
spanning path of B, since (i) P is a local shortest path of B, (ii) v9
and v8 are on different sides of B's vertical bisector, and (iii) neither
v9 nor v8 is in a grid cell adjacent to the bisector. Accordingly,
the edge (v6, v10) is an arterial edge of B, as it is the only edge
in P that intersects B's vertical bisector. Likewise, (v11, v7, v4) is
also a spanning path of B, and (v11, v7) is an arterial edge of B.
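Conditions (i) and (ii) of Definition 1 amount to requiring that the two endpoints lie in opposite strips of B. The sketch below checks this for the vertical bisector only and then extracts the edges that cross it. It assumes, without verifying, that the input path is already a local shortest path in B; the coordinate representation, the function names, and the sample path are hypothetical, and an edge that merely touches the bisector at an endpoint is not counted as crossing it.

```python
def column(x, x0, cell):
    """Column index of an x-coordinate relative to the region's west edge."""
    return int((x - x0) // cell)

def endpoints_span_vertical_bisector(p, q, x0, cell):
    """True if p and q lie in the west and east strips (in either order)."""
    return {column(p[0], x0, cell), column(q[0], x0, cell)} == {0, 3}

def crosses_vertical_bisector(a, b, x0, cell):
    """True if the segment from a to b has endpoints strictly on both sides."""
    xb = x0 + 2 * cell
    return (a[0] - xb) * (b[0] - xb) < 0

def arterial_edges_on_path(path, x0, cell):
    """Edges of a (presumed local shortest) path that cross the vertical bisector."""
    if not endpoints_span_vertical_bisector(path[0], path[-1], x0, cell):
        return []                  # not a spanning path for this bisector
    return [(a, b) for a, b in zip(path, path[1:])
            if crosses_vertical_bisector(a, b, x0, cell)]

# A made-up path running from the west strip to the east strip.
path = [(0.5, 1.0), (1.5, 1.2), (2.5, 1.1), (3.5, 1.0)]
print(arterial_edges_on_path(path, 0.0, 1.0))  # [((1.5, 1.2), (2.5, 1.1))]
```

The horizontal bisector can be handled symmetrically by swapping the roles of the two coordinates.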

As explained previously, the number of arterial edges in a 
(4x4)-cell region B tends to be small in practice, since there usu- 
ally exist only a few major connections between opposite strips of 
B. We formalize this observation as follows. 

Assumption 1 (Arterial Dimension). For any square 
grid on G and any region B with 4x4 grid cells, the number 
of arterial edges of B is at most a constant λ, referred to as the
arterial dimension of G.

To demonstrate the applicability of Assumption 1, we conduct
an experiment on eight real datasets that represent various parts of
the road network in the United States (see Section 6 for details).
The weight of each edge in the data equals the time required to 
travel between the two endpoints of the edge. On each dataset, we 
impose a 2^r × 2^r square grid (r ∈ [3, 17]), and compute the number
of arterial edges for each (4 x 4)-cell region (ignoring the regions 
that are empty). After that, we compute the maximum number of 
arterial edges for a region, as well as the mean, 90% quantile, and 
99% quantile. Figure 3 plots the results as functions of the grid
resolution r. Regardless of the grid resolution and the dataset size, 
the maximum number of arterial edges for a (4x4)-cell region is 
at most 97, and is below 60 in most cases. Furthermore, the 90% 
and 99% quantiles are at most 60, while the mean is never above 
22. This indicates that practical road networks have fairly small 
arterial dimensions. In Sections 3 and 4, we will exploit this fact to
construct efficient indices for shortest path and distance queries. 

3. A FIRST-CUT SOLUTION 

This section presents FC (first-cut), an index structure designed 
for road networks with small arterial dimensions. FC is worst-case 
efficient for distance queries, and its space consumption is modest; 
nevertheless, FC is unsuitable for large road networks as it incurs 
significant pre-processing cost. The reasons that we introduce FC 
are (i) it is a conceptually simple method that demonstrates the key 
idea of our proposal, and (ii) with a few modifications and opti- 
mizations, FC can be turned into a scalable method that handles 
both distance and shortest path queries (see Section 4).

3.1 Index Construction 

Given a road network G, FC first assigns a level to each node in 
G, such that nodes with higher levels tend to be more important. 
After that, FC organizes the nodes into a hierarchy based on their 
levels, and it adds auxiliary edges between various nodes to facili- 
tate query processing. In the following, we will elaborate how the 
node levels are decided and how the auxiliary edges are created. 

Deciding Node Levels. First, FC imposes on G a (4×4)-cell
square grid that tightly covers all nodes in G. After that, FC recursively
splits each grid cell into 2×2 smaller cells, until each cell
contains at most one node in G. This results in a sequence of square
grids with increasing resolutions. Let h be the number of grids thus
constructed. We use R_i to denote the grid with 2^(h+2−i) × 2^(h+2−i)
cells, i.e., R_h is the (4×4)-cell grid that FC first constructed, and
R_1 is the grid with the highest resolution.

Figure 4: Strips and bisectors of a region with 4×4 cells.

Let d_max (resp. d_min) be the largest (resp. smallest) L∞ distance
between any two nodes in G. It can be verified that h <
log2(d_max/d_min) − 1. We note that h is always a small number
for practical road networks: Even if d_max is as large as the length
of the Equator (≈ 4 × 10^7 meters) and d_min is as small as 1 meter,
the value of h is no more than 26.
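The recursive splitting can be simulated directly to obtain h for a given point set. The following is a rough sketch under assumptions made here: nodes are distinct 2-D points, the covering square is anchored at the minimum coordinates, and a tiny padding keeps boundary points inside the last row and column of cells. The function name `count_grids` is hypothetical.

```python
def count_grids(points):
    """Return (h, cells per side of the finest grid R_1) for distinct 2-D points."""
    if len(set(points)) != len(points):
        raise ValueError("points must be distinct, otherwise splitting never stops")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Side of a bounding square; the small padding keeps the largest
    # coordinates strictly inside the last row and column of cells.
    side = (max(max(xs) - x0, max(ys) - y0) or 1.0) * (1.0 + 1e-9)
    cells_per_side, h = 4, 1               # R_h is the initial 4x4 grid
    while True:
        cell = side / cells_per_side
        occupied = {(int((x - x0) // cell), int((y - y0) // cell))
                    for x, y in points}
        if len(occupied) == len(points):   # every cell holds at most one node
            return h, cells_per_side
        cells_per_side *= 2                # split every cell into 2x2 cells
        h += 1

h, finest = count_grids([(0.0, 0.0), (0.5, 0.5), (4.0, 4.0)])
print(h, finest)  # 3 16
```

In this toy run the finest grid has 16 = 2^(h+1) cells per side, matching the definition of R_1 above. The padding can cost one extra level compared with an exact tight cover, so the sketch should not be used to check the bound on h.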

Given each R_i (i ∈ [1, h]), FC computes the arterial edges in
any (4×4)-cell region in R_i. Let A_i be the set of arterial edges
obtained from R_i. For any edge in A_i, we define it as a level-i edge
if it does not appear in A_{i+1}, ..., A_h. If an edge does not appear
in any A_i, then we refer to it as a level-0 edge. In other words, an
edge has a higher level if it is an arterial edge for a larger region.
Similarly, we also define the level of each node v in G: we say that
v is a level-i node if it is adjacent to some edge at level i but not
any edge at level i + 1, ..., h. Intuitively, a higher-level node tends
to be more important for shortest path and distance queries.
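Given the sets A_1, ..., A_h, the level assignment reduces to two maxima: an edge's level is the largest i such that the edge appears in A_i (or 0 if there is none), and a node's level is the largest level among its incident edges. A small sketch follows, with hypothetical function names and made-up input data.

```python
def edge_levels(all_edges, arterial_sets):
    """arterial_sets[i - 1] plays the role of A_i for i = 1..h."""
    level = {e: 0 for e in all_edges}      # default: level-0 edge
    for i, a_i in enumerate(arterial_sets, start=1):
        for e in a_i:
            level[e] = i                   # a larger i overwrites a smaller one
    return level

def node_levels(edge_level):
    """A node's level is the highest level among the edges adjacent to it."""
    level = {}
    for (u, v), lv in edge_level.items():
        level[u] = max(level.get(u, 0), lv)
        level[v] = max(level.get(v, 0), lv)
    return level

edges = [("a", "b"), ("b", "c"), ("c", "d")]
A1 = {("a", "b"), ("b", "c")}
A2 = {("b", "c")}
e_lv = edge_levels(edges, [A1, A2])
print(e_lv)               # ('b', 'c') is level 2, ('a', 'b') level 1, ('c', 'd') level 0
print(node_levels(e_lv))  # a: 1, b: 2, c: 2, d: 0
```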

Creation of Shortcuts. Once the node levels are decided, FC
organizes the nodes in G into a hierarchy H of h + 1 levels
L_0, L_1, ..., L_h, such that all level-i (i ∈ [0, h]) nodes are contained
in L_i. For example, Figure 2 illustrates a 3-level hierarchy
of the nodes in Figure 1.

The hierarchy H retains all edges in G. In addition, FC inserts
into H some auxiliary edges, referred to as shortcuts. For any two
nodes v_s and v_t, FC creates a shortcut c from v_s to v_t, if the shortest
path from v_s to v_t only passes through nodes whose levels are lower
than both v_s's and v_t's. Furthermore, the length of c equals
the distance from v_s to v_t, i.e., l(c) = dist(v_s, v_t). For instance,
consider the nodes v6, v8, v9, v10 in Figure 2, whose levels are 0,
1, 1, and 2, respectively. There is a shortcut from v9 to v10, since
the shortest path from v9 to v10 only goes through v6, and the level
of v6 is lower than those of v9 and v10. On the other hand, there
is no shortcut from v8 to v9, since the shortest path from v8 to v9
passes through v10, whose level is higher than both v8's and v9's.

The shortcuts inserted into H enable us to avoid visiting unimportant
nodes when processing distance queries. For example,
given the shortcut c from v9 to v10 in Figure 2, we can determine
that dist(v9, v10) = l(c), without having to compute the actual
shortest path from v9 to v10. In general, for any two nodes v_s and
v_t in H, there exists a path from v_s to v_t that bypasses unimportant
nodes with shortcuts, as will be explained in Section 3.2. For
simplicity, we will use the term "edge" to refer to either an original
edge or a shortcut in H, unless otherwise specified.
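The shortcut rule can be expressed as a one-line test on a known shortest path. The sketch below assumes the shortest path between the two nodes has already been computed and is supplied as a node sequence; the function name is hypothetical. The node levels are those given in the example above, and the second path only encodes the stated fact that the route from v8 to v9 passes through v10 (its exact node sequence is an assumption).

```python
def has_shortcut(path, level):
    """True if every interior node of the shortest path has a level lower
    than the levels of both endpoints."""
    bound = min(level[path[0]], level[path[-1]])
    return all(level[x] < bound for x in path[1:-1])

level = {"v6": 0, "v8": 1, "v9": 1, "v10": 2}
print(has_shortcut(["v9", "v6", "v10"], level))   # True
print(has_shortcut(["v8", "v10", "v9"], level))   # False
```

A path without interior nodes passes the test trivially; it corresponds to an original edge of G rather than a shortcut.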
Figure 3: Arterial dimensions of real road networks.



3.2 Query Processing 

Consider a query q that asks for the distance from a node s in 
G to another node t. Given the node hierarchy H, FC answers q 
with two concurrent traversals of H that start from s and t, respectively.
Each traversal is performed using a constrained version of
Dijkstra's algorithm [9], as explained in the following.

Traversal Algorithm. The traversal from s maintains a hash table
T_s and a priority queue Q_s. The hash table T_s maps each node v
in G to a value κ_s(v), which equals the length of the shortest path
from s to v that has been found so far. Initially, we have κ_s(s) = 0
and κ_s(v) = +∞ for any other node v.

Meanwhile, each entry in the priority queue Q_s corresponds to
a certain node v' in G, and the key of the entry equals κ_s(v'). In
the beginning of the traversal, Q_s contains only one entry, which
corresponds to s. Subsequently, FC iteratively extracts (from Q_s)
the node u with the smallest key. For each u extracted, FC inspects
every edge (u, v) in H that starts from u, and it checks whether
v satisfies certain constraints. (We will clarify these constraints
shortly.) If v violates any of the constraints, it would be ignored;
otherwise, FC would further check whether κ_s(u) + l((u, v)) <
κ_s(v), i.e., whether the path from s to v via u is shorter than all
known paths from s to v. If the inequality holds, then FC sets
κ_s(v) = κ_s(u) + l((u, v)) and inserts v into Q_s (if v has not been
inserted before).

The traversal from t also maintains a hash table T_t and a priority
queue Q_t. It is performed in a manner similar to the traversal from
s, with one notable difference: Whenever FC extracts a node u
from Q_t, it only inspects the edges (v, u) that point to u. In other
words, the traversal from t focuses on paths that end at t.

FC conducts the above two traversals in a round-robin fashion,
i.e., it extracts nodes from the two priority queues Q_s and Q_t in
turns. To determine when the traversals can be terminated, FC
maintains a variable θ that records the length of the shortest path
from s to t that is seen so far. Initially, θ = +∞. After that, for
each node u extracted from either priority queue, FC retrieves its
key κ_s(u) in the hash table T_s, as well as its key κ_t(u) in T_t. Recall
that κ_s(u) (resp. κ_t(u)) records the length of the shortest path
from s to u (resp. from u to t) found so far. Therefore, the shortest
path from s to t should be no longer than κ_s(u) + κ_t(u). Based on
this, if κ_s(u) + κ_t(u) < θ, then FC would update θ and set it to
κ_s(u) + κ_t(u).

Whenever θ is no more than the smallest key value in Q_s, we
know that dist(s, u) ≥ θ for any node u remaining in Q_s, which
indicates that u cannot be on the shortest path from s to t. In that
case, FC would terminate the traversal from s. Similarly, the traversal
from t is stopped if θ is no more than any key value in Q_t.
When both traversals are terminated, FC returns θ as the answer to
the distance query.
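The two round-robin traversals and the θ-based termination rule can be sketched as below. This is an illustrative reading of the description, not the authors' code. The adjacency-list format, the names `bidirectional_distance` and `reverse_graph`, and the `allowed` callback (a placeholder through which the constraints of the next paragraphs could be plugged in) are all assumptions made here. Stale heap entries are skipped lazily instead of guarding against repeated insertion. With the default callback, which accepts every neighbor, the function behaves as a plain bidirectional Dijkstra search on an arbitrary graph.

```python
import heapq

INF = float("inf")

def reverse_graph(out_edges):
    """Build incoming adjacency lists from outgoing ones."""
    in_edges = {}
    for u, nbrs in out_edges.items():
        for v, w in nbrs:
            in_edges.setdefault(v, []).append((u, w))
    return in_edges

def bidirectional_distance(out_edges, s, t, allowed=lambda side, u, v: True):
    adj = {"s": out_edges, "t": reverse_graph(out_edges)}
    kappa = {"s": {s: 0.0}, "t": {t: 0.0}}       # tentative distances (T_s, T_t)
    queue = {"s": [(0.0, s)], "t": [(0.0, t)]}   # priority queues (Q_s, Q_t)
    settled = {"s": set(), "t": set()}
    theta = INF                                  # best s-to-t length seen so far
    side = "s"
    while queue["s"] or queue["t"]:
        for x in ("s", "t"):                     # stop a traversal once theta
            if queue[x] and theta <= queue[x][0][0]:   # is <= its smallest key
                queue[x] = []
        if not queue[side]:
            side = "t" if side == "s" else "s"
            if not queue[side]:
                break
        key, u = heapq.heappop(queue[side])
        if u not in settled[side]:
            settled[side].add(u)
            theta = min(theta, kappa["s"].get(u, INF) + kappa["t"].get(u, INF))
            for v, w in adj[side].get(u, []):
                if allowed(side, u, v) and key + w < kappa[side].get(v, INF):
                    kappa[side][v] = key + w
                    heapq.heappush(queue[side], (key + w, v))
        side = "t" if side == "s" else "s"       # round-robin between traversals
    return theta

# Toy directed network (illustrative data only).
roads = {"s": [("a", 1.0), ("t", 5.0)], "a": [("t", 1.0)]}
print(bidirectional_distance(roads, "s", "t"))  # 2.0
```

On a hierarchy H, the callback would reject neighbors that violate the level or proximity constraint; the correctness of the pruned search then depends on the shortcuts, as argued in Section 3.4.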

Constraints on Node Traversals. As mentioned above, whenever 
FC extracts a node u from a priority queue (either Qs or Qt), it 
inspects the neighbors of u, and it processes only those neighbors 
v that satisfy certain constraints. Specifically, there are two constraints
on v.

1. Level Constraint: v should not be at a level lower than u's.

2. Proximity Constraint: Let i be the level of v (i ∈ [0, h − 1]).
If u is extracted from Q_s (resp. Q_t), then v and s (resp. t)
should be covered in the same (3×3)-cell region in R_{i+1}.
(Recall that R_i is a square grid with 2^(h+2−i) × 2^(h+2−i) cells.)
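Both constraints can be written as simple predicates. The sketch below is an illustration under several assumptions introduced here: node coordinates are stored in a dictionary `pos`, the top-level grid covers a square of side `side` anchored at `origin`, and nodes at level h are left unconstrained by the proximity test because the constraint above is stated only for i ∈ [0, h − 1]. It relies on the fact that two cells fit into a common (3×3)-cell region exactly when their column and row indices each differ by at most 2.

```python
def cell_index(point, origin, cell_size):
    """Column and row index of the grid cell containing the point."""
    return (int((point[0] - origin[0]) // cell_size),
            int((point[1] - origin[1]) // cell_size))

def level_ok(level, u, v):
    """Level constraint: v must not be at a lower level than u."""
    return level[v] >= level[u]

def proximity_ok(pos, level, v, anchor, origin, side, h):
    """Proximity constraint: v and the query endpoint `anchor` (s or t) must
    fit in one 3x3-cell region of R_{i+1}, where i is the level of v."""
    i = level[v]
    if i > h - 1:
        return True                              # assumed: no constraint at level h
    cells_per_side = 2 ** (h + 2 - (i + 1))      # resolution of R_{i+1}
    cell = side / cells_per_side
    cv = cell_index(pos[v], origin, cell)
    ca = cell_index(pos[anchor], origin, cell)
    return abs(cv[0] - ca[0]) <= 2 and abs(cv[1] - ca[1]) <= 2

# Made-up coordinates on a 16 x 16 square with h = 2.
pos = {"s": (1.0, 1.0), "v": (7.0, 1.0), "w": (5.0, 1.0)}
level = {"s": 0, "v": 0, "w": 0}
print(proximity_ok(pos, level, "v", "s", (0.0, 0.0), 16.0, 2))  # False
print(proximity_ok(pos, level, "w", "s", (0.0, 0.0), 16.0, 2))  # True
```

A higher-level node is tested against a coarser grid, so the same node v would pass the test if its level were 1 instead of 0.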

Both of the above constraints are intended to improve the efficiency
of FC. In particular, the level constraint helps FC bypass
unimportant nodes during query processing. For example, consider
that we use FC to compute the distance from v8 to v11 in Figure 2.
As explained previously, FC would invoke two traversals starting
from v8 and v11, respectively. Since v11 is at level 2, the traversal
from v11 would only visit the neighbors of v11 that are at levels
Figure 5: Illustration of the proximity constraint.



no lower than 2. As a consequence, v10 is visited (since its level
equals 2), while the lower-level neighbors of v11 (such as v1) are
bypassed. After that, the traversal would visit any neighbor of v10
whose level is no lower than that of v10. Since none of the neighbors
of v10 fulfills this requirement,
the traversal terminates. Similarly, the traversal from v8 would first
visit two of v8's neighbors, v7 and v10, ignoring the remaining
neighbor v5, since v5's level is lower than that of v8. After that,
the traversal visits only v11 and terminates, as all other remaining
nodes violate the level constraint. In summary, the two traversals
by FC visit only four nodes: v7, v8, v10, and v11.

Meanwhile, the proximity constraint ensures that FC only
searches a small number of grid cells in each level of the node
hierarchy H. For example, suppose that we are given the node
hierarchy in Figure 5, and we use FC to compute the distance from
v1 to v6. Among the two traversals invoked by FC, the one starting
from v1 would first visit v2, and then v3 and v7. The node v7 has
a neighbor v8, which is at the same level as v7, i.e., v8 satisfies the
level constraint. However, v8 would still be ignored by FC, as it
violates the proximity constraint. In particular, v8 is a level-1 node,
but there does not exist any (3×3)-cell region in R_2 that can cover
both v1 and v8. In contrast, the node v4, which is a neighbor of v3,
would be visited by FC as it satisfies both the level and proximity
constraints. Specifically, the level of v4 equals 2, which is no less
than that of v3; furthermore, v4 and v1 are contained in the same
(3×3)-cell region in R_3. Note that, although FC ignores v8, the
correctness of the query result is not affected, since v8 is not on the
shortest path from v1 to v6.

In general, the proximity constraint guarantees that in each level
i of the node hierarchy, FC only traverses the nodes contained in
two (5×5)-cell regions, which are centered at the source s and destination
t of the query, respectively. In particular, the region centered
at s (resp. t) is the union of all (3×3)-cell regions that cover
s (resp. t). This, when combined with the level constraint, ensures
that FC is worst-case efficient in terms of query time, as will be
shown in Section 3.3.
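The claim that the (3×3)-cell regions covering a given cell together form the (5×5)-cell region centered at that cell can be checked by brute force. The short sketch below works on integer cell indices of an unbounded grid; the function name is hypothetical.

```python
def union_of_3x3_regions(cx, cy):
    """Union of all 3x3-cell regions that contain the cell (cx, cy)."""
    covered = set()
    for ox in range(cx - 2, cx + 1):       # lower-left corners of the regions
        for oy in range(cy - 2, cy + 1):   # that still contain (cx, cy)
            for dx in range(3):
                for dy in range(3):
                    covered.add((ox + dx, oy + dy))
    return covered

centered_5x5 = {(x, y) for x in range(8, 13) for y in range(18, 23)}
print(union_of_3x3_regions(10, 20) == centered_5x5)  # True
```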



3.3 Complexity Analysis 

In this section, we will prove that FC takes O(hn) space, and it
answers any distance query in O(h^2) time, where h is the maximum
level in the node hierarchy H, and n is the number of
nodes in the road network G. In addition, we will discuss the pre- 
computation time of FC. 



Query Time. As explained in Section 3.2, FC answers any distance
query by two traversals on the node hierarchy H, starting from the 
source s and destination t of the query, respectively. Due to the 
level and proximity constraints, each traversal of FC visits any level 
of H at most once; in addition, for the i-th level (i ∈ [0, h]), each
traversal only examines the nodes in a (5×5)-cell region in the grid
R_{i+1}. A natural question is: How many level-i nodes are there in
the (5x5)-cell region? The following lemma provides an answer. 



Lemma 1. Any (a×a)-cell region in R_i contains O(a^2 λ)
level-i nodes in H, where λ is the arterial dimension of G.

To explain the rationale behind Lemma 1, recall that each level-i
node in H is adjacent to an arterial edge in a (4×4)-cell region
in R_i. Furthermore, each (4×4)-cell region in R_i has at most λ
arterial edges. For any (a×a)-cell region in R_i, it can overlap
with O(a^2) regions of 4 × 4 cells, and hence, it contains O(a^2 λ)
level-i nodes.

Observe that any (5×5)-cell region in R_{i+1} corresponds to a
(10×10)-cell region in R_i. By Lemma 1, this region contains
O(λ) level-i nodes in H. In other words, the number of level-i
nodes visited by each traversal of FC is O(λ). Given that H has
h + 1 levels, the total number of nodes traversed by FC is O(hλ).
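The O(a^2) count used in the rationale of Lemma 1 can be made explicit: on an unbounded grid, an (a×a)-cell region shares at least one cell with exactly (a + 3)^2 regions of 4 × 4 cells. The brute-force sketch below (hypothetical function name, integer cell indices) confirms this for the (10×10)-cell case discussed above.

```python
def overlapping_4x4_regions(a):
    """Count 4x4-cell regions sharing at least one cell with an a-by-a region."""
    target = {(x, y) for x in range(a) for y in range(a)}
    count = 0
    for ox in range(-4, a + 1):            # candidate lower-left corners
        for oy in range(-4, a + 1):
            region = {(ox + dx, oy + dy) for dx in range(4) for dy in range(4)}
            if region & target:
                count += 1
    return count

print(overlapping_4x4_regions(10))  # 169, i.e., (10 + 3) ** 2
```

Since each such region contributes at most λ arterial edges, the number of level-i nodes in a (10×10)-cell region is bounded by a constant multiple of λ, which is the O(λ) figure used above.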

Next, we will show that each node in H has O(hλ) edges that
satisfy the level constraint. (The edges that violate the constraint 
can be removed from H beforehand, as they would never be tra- 
versed by FC for any query). Consider any level-i node u, and any 
node v whose level is at least i. By the way that H is constructed,
there is a shortcut connecting u to v, if and only if the shortest path 
between u and v only goes through nodes at levels lower than i. 
Intuitively, this indicates that u and v should not be too far apart 
from each other; otherwise, the shortest path between u and v in G
would be a path that connects two distant locations, in which case 
the path might contain some highly important node at a level higher 
than i, due to which there would not be any shortcut between u and
v. More formally, we have the following lemma:

Lemma 2. Let P be a shortest path in G, such that no (3×3)-cell
region in R_i (i ∈ [1, h]) can cover all nodes in P simultaneously.
Then, P must contain an arterial edge of some (4×4)-cell
region in R_i.

By Lemma 2, u and v must be covered in the same (3×3)-cell
region in R_{i+1}; otherwise, the shortest path between u and v must
pass through a level-(i+1) node, for which there cannot exist any
shortcut between u and v. This implies that v must be in the (5×5)-cell
region in R_{i+1} that is centered at u. By Lemma 1, this region
contains O(λ) level-i nodes in H. With a similar analysis, it can be
shown that the region also covers O(λ) nodes at any level higher
than i. Therefore, the total number of edges adjacent to u is O(hλ).

In summary, FC answers any distance query with two constrained
Dijkstra searches, each of which traverses O(hλ) nodes and
O(h^2 λ^2) edges. As such, the time complexity of each traversal
equals O(hλ log(hλ) + h^2 λ^2). Given that the arterial dimension
λ of the road network G is a constant, the overall time complexity
of FC is O(h^2).
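For completeness, the simplification behind the final bound can be spelled out as follows. The symbol T_traversal is introduced here only for illustration; the first term accounts for the priority queue operations on O(hλ) nodes, and the second for the O(hλ) edges inspected at each of them.

```latex
\begin{align*}
T_{\text{traversal}}
  &= O\bigl(h\lambda \log(h\lambda) + h^{2}\lambda^{2}\bigr) \\
  &= O\bigl(h \log h + h^{2}\bigr)
     && \text{(treating } \lambda \text{ as a constant)} \\
  &= O\bigl(h^{2}\bigr)
     && \text{(since } h \log h \le h^{2} \text{ for } h \ge 1\text{)}.
\end{align*}
```

Two such traversals are performed per query, which leaves the asymptotic bound unchanged.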

Space Complexity. Recall that the node hierarchy contains h + 1 
levels, each of which contains O(n) nodes. In addition, each node
in H has O(hλ) edges. Therefore, the space consumption of FC is
O(hn) when λ is constant.

Preprocessing Cost. The pre-computation of FC consists of two 
steps: First, we identify the arterial edges in any (4x4)-cell region 
in any grid R_i (i ∈ [1, h]); after that, we decide the level of each
node and we connect pairs of nodes with shortcuts. The identifi- 
cation of arterial edges requires computing the shortest paths in all 
(4x4)-cell regions in all Ri, which incurs considerable overhead, 
especially when the granularity of the grid is low. Similarly, the 
construction of shortcuts is time consuming as it requires deriving a 
larger number of shortest paths (between nodes that are potentially 
far apart). Such significant preprocessing cost renders FC applicable
only to small road networks. In Section 4, we will address this
issue and present a modified and scalable version of FC.



3.4 Correctness Proof 

Let P = (v_1, v_2, ..., v_k) be a shortest path in G. Let P' be a
path from v_1 to v_k on the node hierarchy H, such that FC reports
l(P') as the distance from v_1 to v_k. We will prove the correctness
of FC's result by showing that l(P') = l(P). In particular, we will
show that both l(P') ≥ l(P) and l(P') ≤ l(P) hold.

Proving l(P') ≥ l(P). Recall that every shortcut on H corresponds
to a path in G. Therefore, if we replace each shortcut in P'
with the corresponding path, we can transform P' into a path P'',
such that (i) P'' does not contain any shortcut, (ii) P'' connects
v_1 to v_k, and (iii) l(P') = l(P''). On the other hand, we have
l(P'') ≥ l(P), since P is the shortest path from v_1 to v_k in G.
Therefore, l(P') = l(P'') ≥ l(P).

Proving l(P') ≤ l(P). Assume for simplicity that P contains
a node v_j (j ∈ [1, k]) whose level is higher than that of any
other node on P. (Our analysis can be easily extended to the
case when the highest-level node on P is not unique.) Let P_1 =
(v_1, v_2, ..., v_j) and P_2 = (v_j, v_{j+1}, ..., v_k). In the following,
we will show that the node hierarchy H contains a path P'_1 from
v_1 to v_j that has the same length as P_1. Furthermore, we will
prove that the sequence of nodes on P'_1 satisfies both the level and
proximity constraints, i.e., P'_1 can be identified by FC with a traversal
starting from v_1. In a similar manner, it can be shown that H
contains a path P'_2 from v_j to v_k, such that l(P'_2) = l(P_2), and that
P'_2 can be found by FC with a constrained Dijkstra search starting
from v_k. This would lead to

l(P') ≤ l(P'_1) + l(P'_2) = l(P_1) + l(P_2) = l(P).

Consider the path $P_1 = \langle v_1, v_2, \ldots, v_j \rangle$. Suppose that we remove
from $P_1$ any node $v_i$ ($i \in [1, j]$) that has a smaller level than
some node $v_a$ ($a < i$) preceding it. Let $S = \langle v'_1, v'_2, \ldots, v'_b \rangle$ be the
sequence of nodes remaining on $P_1$. We have $v'_1 = v_1$ (since no
node precedes $v_1$), and $v'_b = v_j$ (since $v_j$ is the highest-level node
in P). For instance, if $P_1$ contains six nodes $v_1, v_2, v_3, v_4, v_5, v_6$
at levels 1, 0, 2, 2, 1, 3, respectively, then $S = \langle v_1, v_3, v_4, v_6 \rangle$.
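The construction of S amounts to keeping the running maxima of the level sequence. The following is a minimal sketch of that filter, not code from the original work; the function name `prefix_maxima` and the list-of-levels input are assumptions made for illustration.

```python
def prefix_maxima(levels):
    """Return the 1-based positions of the nodes that remain in S.

    A node is dropped when some earlier node on the path has a strictly
    higher level; otherwise it is kept.
    """
    kept = []
    highest_so_far = float("-inf")
    for position, level in enumerate(levels, start=1):
        if level >= highest_so_far:
            kept.append(position)
            highest_so_far = level
    return kept


# Levels of v_1..v_6 from the example in the text.
example = prefix_maxima([1, 0, 2, 2, 1, 3])
```

On the six-node example, `example` holds the positions 1, 3, 4 and 6, which matches the sequence S given in the text.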

By the way that S is constructed, for any $v'_i$ ($i \in [1, b-1]$), the
shortest path from $v'_i$ to $v'_{i+1}$ contains only nodes whose levels are
smaller than those of $v'_i$ and $v'_{i+1}$. As such, the node hierarchy
H would contain a shortcut from $v'_i$ to $v'_{i+1}$, and the length of the
shortcut equals $dist(v'_i, v'_{i+1})$. This indicates that H contains a
path $P'_1 = \langle v'_1, v'_2, \ldots, v'_b \rangle$ that connects $v_1$ to $v_j$, such that
$l(P'_1) = l(P_1)$. Furthermore, $P'_1$ satisfies the level constraint, since the level
of $v'_i$ ($i \in [1, b-1]$) is no larger than that of $v'_{i+1}$.

Assume to the contrary that $P'_1$ does not satisfy the proximity
constraint. Then, there should exist a node $v'_a$ ($a \in [1, b-1]$) on
$P'_1$, such that (i) $v'_a$ is at level i ($i \in [0, h]$), but (ii) no (3×3)-cell
region in $R_{i+1}$ covers both $v'_1$ and $v'_{a+1}$. Then, by Lemma 2,
the shortest path from $v'_1$ to $v'_a$ must contain an arterial edge e of a
(4×4)-cell region in $R_{i+1}$, since none of the (3×3)-cell regions in
$R_{i+1}$ covers both $v'_1$ and $v'_a$. In that case, each endpoint of e has
a level at least $i + 1$. In other words, on the shortest path from $v'_1$
to $v'_a$, there exists some node whose level is higher than that of $v'_a$
(recall that $v'_a$ is at level i). This contradicts the assumption that $v'_a$
has a level no lower than any node preceding it on $P_1$.

In summary, the node hierarchy H contains a path $P'_1$ from $v_1$
to $v_j$, such that $P'_1$ has the same length as $P_1$ and satisfies both
the level and proximity constraints. Therefore, FC can correctly
identify the distance from $v_1$ to $v_j$ with a traversal starting from $v_1$.
Similarly, we can show that FC can correctly compute the distance
from $v_j$ to $v_k$ with a traversal starting from $v_k$. This proves the
correctness of the query processing algorithm of FC.



4. ARTERIAL HIERARCHY 

This section presents Arterial Hierarchy (AH), a scalable indexing
method built upon the FC approach introduced in Section 3.
Compared with FC, AH has the same space complexity and a similar
time complexity for distance queries, but a significantly smaller
pre-computation cost. In addition, AH supports shortest path
queries in a worst-case efficient manner.

4.1 Overview 

The main structure of AH is a node hierarchy $H^*$ that resembles
FC's node hierarchy H. In particular, both $H^*$ and H have $h + 1$
levels, and both of their i-th levels ($i \in [1, h]$) are associated with
a square grid $R_i$ of $2^{h+1-i} \times 2^{h+1-i}$ cells. However, AH and
FC differ substantially in the ways that they decide node levels,
construct shortcuts, and process queries.

Differences in Node Levels. To compute the level of each node,
FC first imposes each $R_i$ on the road network G, and then computes
the arterial edges in each (4×4)-cell region in $R_i$, after which FC
decides the node levels based on the arterial edges. As discussed in
Section 3.3, the derivation of arterial edges could incur significant
overheads, since each (4×4)-cell region in a coarse grid may cover
a large number of nodes and edges in G.

In contrast, AH computes node levels with an incremental algorithm
that substantially improves efficiency. Given G, it first
imposes the grid $R_1$ on G. Based on $R_1$, it identifies a set of unimportant
nodes in G, and it assigns them to level 0 of the node hierarchy
$H^*$. Then, it removes a subset of the unimportant nodes
from G, and constructs shortcuts between the remaining nodes.
This results in a reduced graph $G_1$ that is considerably smaller
than G. After that, AH recursively reduces $G_1$ into smaller graphs
$G_2, G_3, \ldots, G_h$, during which it assigns nodes to higher levels of
$H^*$. For the reduction from $G_i$ to $G_{i+1}$, AH needs to impose the
grid $R_{i+1}$ on $G_i$ and compute the shortest paths in each (4×4)-cell
region. However, this computation is inexpensive, since $G_i$ is
much smaller than G and, hence, each (4×4)-cell region
in $R_{i+1}$ contains only a small number of nodes and edges in $G_i$.

Differences in Shortcuts. FC creates only the shortcuts necessary to
ensure correct results for distance queries under the level and proximity
constraints. In contrast, the shortcuts constructed by AH serve
not only to process distance queries under the level and proximity
constraints, but also to compute the actual shortest path
between any two given nodes. Specifically, every shortcut $\langle v_a, v_c \rangle$
in AH's node hierarchy $H^*$ is associated with a node $v_b$, such that
(i) both $\langle v_a, v_b \rangle$ and $\langle v_b, v_c \rangle$ are edges in $H^*$, and (ii) the length
of $\langle v_a, v_c \rangle$ equals the lengths of $\langle v_a, v_b \rangle$ and $\langle v_b, v_c \rangle$ combined. In
other words, $\langle v_a, v_c \rangle$ can be transformed into a two-hop shortest
path $\langle v_a, v_b, v_c \rangle$. As such, given any path $P'$ in $H^*$, we can transform
$P'$ into a path in G, by recursively replacing each shortcut in
$P'$ with its corresponding two-hop path.

For example, Figure ?? illustrates a shortest path
$\langle v_1, v_2, \ldots, v_6 \rangle$ in G, as well as three shortcuts $\langle v_1, v_4 \rangle$,
$\langle v_2, v_4 \rangle$, and $\langle v_4, v_6 \rangle$. The shortcut $\langle v_1, v_4 \rangle$ is associated with
the node $v_2$, since $v_1$ is directly connected with $v_2$ and $v_2$ is
directly connected with $v_4$. Similarly, $\langle v_2, v_4 \rangle$ and $\langle v_4, v_6 \rangle$ are
associated with $v_3$ and $v_5$, respectively. Now suppose that, given
a distance query from $v_1$ to $v_6$, AH identifies $P' = \langle v_1, v_4, v_6 \rangle$ as
the shortest path from $v_1$ to $v_6$ in $H^*$. To derive the actual shortest
path from $v_1$ to $v_6$ in G, AH first replaces the shortcut $\langle v_1, v_4 \rangle$ in
$P'$ with a two-hop path $\langle v_1, v_2, v_4 \rangle$, since $\langle v_1, v_4 \rangle$ is associated
with $v_2$. This transforms $P'$ into another path $\langle v_1, v_2, v_4, v_6 \rangle$.
After that, we can replace $\langle v_2, v_4 \rangle$ with $\langle v_2, v_3, v_4 \rangle$, and substitute
$\langle v_4, v_6 \rangle$ with $\langle v_4, v_5, v_6 \rangle$. As such, we obtain the shortest path
$\langle v_1, v_2, \ldots, v_6 \rangle$ from $v_1$ to $v_6$ in G.

In general, given any shortest path query from a node s to another
node t, AH first computes the shortest path $P'$ from s to t
in $H^*$, and then it converts $P'$ into the corresponding path P in
the original road network. The conversion from $P'$ to P takes only
$O(k)$ time, where k is the number of edges in P. This is because (i)
for any shortcut in $H^*$, we can identify its corresponding two-hop
path in $O(1)$ time, and (ii) converting $P'$ to P requires only $O(k)$
replacements of shortcuts.
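The recursive replacement of shortcuts by two-hop paths can be sketched as follows. This is an illustrative sketch only: the function name `unpack` and the dictionary `middle` (mapping a shortcut to its associated node) are assumptions, not structures described in the original text. An explicit stack stands in for the recursion.

```python
def unpack(path, middle):
    """Expand every shortcut on `path` into original edges.

    `middle` maps a shortcut (a, c) to its associated node b; pairs that
    are absent from `middle` are treated as original edges of G.
    """
    result = [path[0]]
    # Hops still to be emitted, stored in reverse so pop() yields them in order.
    stack = [(path[i], path[i + 1]) for i in range(len(path) - 2, -1, -1)]
    while stack:
        a, c = stack.pop()
        b = middle.get((a, c))
        if b is None:
            result.append(c)          # original edge: emit its head
        else:
            stack.append((b, c))      # second hop, handled after the first
            stack.append((a, b))      # first hop
    return result


# Shortcuts and associated nodes from the example in the text.
middle = {(1, 4): 2, (2, 4): 3, (4, 6): 5}
full_path = unpack([1, 4, 6], middle)
```

Each lookup is a constant-time dictionary access and each original edge is emitted once, which is in line with the $O(k)$ conversion cost stated above.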

Differences in Query Processing. Besides the aforementioned
shortcuts (for reconstructing shortest paths), the node hierarchy $H^*$
of AH also contains some extra shortcuts that can be leveraged for
higher query efficiency. As a consequence, AH's query processing
algorithm is slightly more sophisticated than FC's, as will be
elaborated in Section 4.3.

4.2 Index Construction 

Similar to the case of FC, AH constructs its node hierarchy $H^*$
in two steps: it first assigns each node in G to a level in $H^*$, and
then it constructs shortcuts in $H^*$ for query processing.

Deciding Node Levels. Given the road network G, AH first imposes
on G the grid $R_1$, where each cell contains at most one node.
After that, AH identifies all (4×4)-cell regions in $R_1$ that cover at
least one node in G. For each (4×4)-cell region identified,
AH computes the arterial edges of the region in $O(1)$ time, and it
marks each endpoint of an arterial edge as a level-1 core. At the
same time, AH assigns all unmarked nodes to level 0 of the node
hierarchy $H^*$ since, intuitively, those nodes are less important than
the level-1 cores. After that, if any (4×4)-cell region B contains a
local shortest path P from a level-1 core u to another level-1 core
v, such that P only goes through unmarked nodes, then AH inserts
into G a shortcut $\langle u, v \rangle$ with the same length as P. We say that
$\langle u, v \rangle$ is a shortcut generated from B, and we use $G_1$ to denote the
modified version of G with all shortcuts added. Overall, the computation
of level-1 cores and the construction of $G_1$ take only $O(n)$
time, since the number of non-empty (4×4)-cell regions in $R_1$ is
$O(n)$, and each of those regions contains $O(1)$ nodes and edges in
G (recall that G is degree-bounded).

For example, given the road network G and the grid
$R_1$ in Figure 4, assume that AH identifies 5 level-1 cores:
$v_7, v_8, v_9, v_{10}, v_{11}$. After adding shortcuts, G is transformed into
the graph $G_1$ in Figure 6. There exists a shortcut $\langle v_9, v_{10} \rangle$ in $G_1$
since (i) both $v_9$ and $v_{10}$ are level-1 cores, and (ii) in the (4×4)-cell
region B illustrated in Figure 4, the local shortest path between $v_9$
and $v_{10}$ goes only through $v_6$, which is unmarked. In general, the
shortcuts in $G_1$ ensure that the level-1 cores form a connected graph
even if we remove all unmarked nodes from $G_1$.

Given $G_1$, AH selects a subset of the level-1 cores in $G_1$ that
are deemed more important than the others. The selected nodes are
marked as the level-2 cores, while the remaining level-1 cores are
assigned to level 1 of $H^*$. After that, AH converts $G_1$ into a smaller
graph $G_2$ that retains all level-2 cores. This procedure is applied in
a recursive manner: in the i-th recursion ($i \in [1, h-1]$), AH picks
level-(i+1) cores from the level-i cores in $G_i$, and then assigns the
un-picked ones to level i of $H^*$, after which it transforms $G_i$ into a
smaller graph $G_{i+1}$.

A natural question is: given $G_i$, how should AH select the
level-(i+1) cores from the level-i cores? One straightforward solution
is to construct a subgraph of $G_i$ that contains only the level-i
cores, and then compute the arterial edges in the subgraph to identify
the more important nodes as level-(i+1) cores. For example,
given $G_1$ in Figure 6, we can first construct a subgraph of $G_1$
that contains only the five level-1 cores (i.e., $v_7, v_8, v_9, v_{10}, v_{11}$)
and the edges connecting them (i.e., the five edges on the loop
$\langle v_7, v_8, v_{10}, v_9, v_{11}, v_7 \rangle$). After that, we impose the grid $R_2$ on
the subgraph, compute the arterial edges, and then mark the endpoints
of the arterial edges as level-2 cores. While this approach is
intuitive, we find that (i) the resulting node hierarchy does not guarantee
query correctness under the level and proximity constraints,
and (ii) without the level and proximity constraints, it is difficult to
achieve favorable asymptotic bounds on query time. To address this
issue, we adopt a more careful approach to choose the level-(i+1)
cores without affecting the applicability of the level and proximity
constraints in query processing. Specifically, our approach utilizes
the concept of border nodes:

Definition 2 (Border Nodes). Let B be a (4×4)-cell region
in $R_i$ ($i \in [1, h]$). A node v in G is a border node of B, if (i)
v is not contained in the 2×2 cells centered at B, and (ii) v is an
endpoint of an edge in G that intersects the boundary of the east,
west, south, or north strip of B.

For example, in Figure 4, $v_1, v_2, v_9, v_{11}$ are all border nodes
of the (4×4)-cell region B, since each of them is an endpoint of
an edge that intersects the boundary of B's west strip, and none
of them is contained in the 2×2 cells centered at B. Similarly,
$v_3, v_4, v_7, v_8$ are also border nodes of B. On the other hand, $v_6$
and $v_{10}$ are not border nodes of B, since they are not adjacent to
any edge that intersects the boundaries of B's four strips.

To select level-(i+1) cores from $G_i$, we first reduce $G_i$ by removing
any node in $G_i$ that is neither a level-i core nor a border
node of any (4×4)-cell region in $R_{i+1}$. We use $G'_i$ to denote the
reduced graph thus obtained. For instance, given $G_1$ in Figure 6,
we would remove $v_5$ and $v_6$, since neither of them is a level-1 core
or a border node in $R_2$. Figure 7 illustrates the reduced graph $G'_1$,
with the border nodes in $R_2$ highlighted.

Given $G'_i$, we impose $R_{i+1}$ on $G'_i$ and inspect each (4×4)-cell
region in $R_{i+1}$ that contains at least one node. For each such region
B, we compute every spanning path of B (see Definition 1) that
satisfies two conditions:

1. Border Condition: The two endpoints of the path are border
nodes of B, while the other nodes are all level-i cores.

2. Coverage Condition: Every shortcut on the path is generated
from a region completely covered by B.

For example, in the (4×4)-cell region in Figure 7, the spanning path
$\langle v_2, v_9, v_{10}, v_8, v_3 \rangle$ satisfies both the border and coverage conditions,
since (i) both $v_2$ and $v_3$ are border nodes, and (ii) the only
shortcut on the path, $\langle v_9, v_{10} \rangle$, is generated from the region B in
Figure 4, which is contained in the current (4×4)-cell region.
For each spanning path P that fulfills the border and coverage
conditions, if it connects the west and east (resp. north and south)
strips of B, we identify the edge¹ in P that intersects B's vertical
bisector (resp. horizontal bisector) as a pseudo-arterial edge of B.
Observe that each pseudo-arterial edge of B corresponds to a path
in G that contains an arterial edge of B. Intuitively, this indicates
the importance of pseudo-arterial edges in the reduced graph $G'_i$.
Accordingly, we mark the two endpoints of every pseudo-arterial
edge as level-(i+1) cores, and we assign all unmarked level-i cores
to the i-th level of the node hierarchy $H^*$. After that, for any local
shortest path in a (4×4)-cell region $B'$, if (i) the two endpoints u and v of
the path are either level-(i+1) cores or border nodes of $B'$, and (ii)
other than its endpoints, the path does not go through any level-(i+1)
core, then we insert into $G'_i$ a shortcut between u and v with
the same length as the local shortest path. Once all such shortcuts
are added, we define the resulting graph as $G_{i+1}$, and use it to
recursively compute higher-level nodes in $H^*$.

It remains to show that we can efficiently derive the pseudo-arterial
edges and construct shortcuts in $G'_i$. Let B be a (4×4)-cell
region in $R_{i+1}$, and u be a border node of B. Suppose that
we invoke Dijkstra's algorithm to start a traversal of $G'_i$ from u;
for each node visited, we follow the outgoing edges of the node,
ignoring any edge that violates the border condition or coverage
condition. Once the traversal terminates, we obtain the spanning
paths of B starting from u, as well as the pseudo-arterial edges
on those paths. Similarly, with a traversal from u that follows only
the incoming edges of each node, we can compute the desired spanning
paths of B ending at u, along with the pseudo-arterial edges
therein. By repeating this process on all border nodes of B, we
can derive the set of all pseudo-arterial edges in B. With the same
traversal algorithm, we can construct all shortcuts in B using two
traversals from each border node of B.

Creation of Shortcuts. After the level of each node is decided,
AH adds shortcuts in the node hierarchy $H^*$ to facilitate query processing.
The construction of shortcuts requires as input a strict total
order on the nodes in the same level of $H^*$. We will elaborate our
ordering approach in Section 4.4, but in general, any strict total order
can be used without affecting the space and time complexities
of AH. For our discussion that follows, it suffices to know that less
important nodes tend to precede more important nodes in our strict
total order. For convenience, we define a rank for each node in G,
such that a node u ranks lower than another node v, if (i) v is at
a higher level than u, or (ii) u and v have the same level, but u
precedes v in the strict total order.
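The rank comparison defined above is a lexicographic comparison on (level, position in the per-level order). A minimal sketch follows; the dictionaries `level` and `order` and the function names are hypothetical and chosen only for illustration.

```python
def rank_key(node, level, order):
    """Sort key for ranks: level first, then position in the strict order."""
    return (level[node], order[node])


def ranks_lower(u, v, level, order):
    """True if u ranks lower than v under the definition in the text."""
    return rank_key(u, level, order) < rank_key(v, level, order)


# Toy data (assumed): "a" is at level 0; "b" precedes "c" within level 1.
level = {"a": 0, "b": 1, "c": 1}
order = {"a": 0, "b": 0, "c": 1}
by_rank = sorted(level, key=lambda x: rank_key(x, level, order))
```

Python compares tuples lexicographically, so the level decides first and the per-level order only breaks ties between nodes at the same level.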

AH constructs shortcuts in $H^*$ in an incremental manner similar
to the algorithm for deciding node levels. In particular, it
first inspects G, and inserts into $H^*$ a set of shortcuts that concern
level-0 nodes. After that, it reduces G to a smaller graph $G^*_1$.
Subsequently, it recursively reduces $G^*_i$ into another graph $G^*_{i+1}$
($i \in [1, h-1]$), during which it constructs shortcuts that concern
nodes at the i-th level of $H^*$. In the following, we will elaborate
the reduction from $G^*_i$ to $G^*_{i+1}$ ($i \in [0, h-1]$), assuming $G^*_0 = G$.
For convenience, we define the level of every edge in $G^*_0$ as $-1$.

Given $G^*_i$, AH first imposes the grid $R_{i+1}$ on $G^*_i$. For each node
$u \in G^*_i$, AH inspects the (5×5)-cell region C centered at u, as
well as the subgraph of $G^*_i$ that consists of any level-(i−1) edge
overlapping with C. Then, AH computes two shortest path trees
(SPT) of the subgraph, as defined in the following:

¹If multiple edges or shortcuts in P intersect the bisector, we
choose an arbitrary one among them as the pseudo-arterial edge.

Definition 3 (Shortest Path Trees (SPT)). Let G be
a graph, and T be a directed spanning tree of G rooted at a node
u. T is a forward SPT of G, if T contains the shortest path from
u to any node in G. On the other hand, if T contains the shortest
path from any node in G to u, then T is a backward SPT of G.
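Both kinds of SPT can be obtained with a textbook Dijkstra traversal: the forward tree on the graph itself and the backward tree on the graph with every edge reversed. The sketch below assumes an adjacency-list dictionary `graph` mapping each node to a list of (neighbor, weight) pairs; this representation and the function names are illustrative assumptions.

```python
import heapq


def shortest_path_tree(graph, root):
    """Dijkstra from `root`; returns (dist, parent) over reachable nodes."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    done = set()
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue  # stale heap entry
        done.add(x)
        for y, w in graph.get(x, ()):
            nd = d + w
            if y not in dist or nd < dist[y]:
                dist[y] = nd
                parent[y] = x
                heapq.heappush(heap, (nd, y))
    return dist, parent


def reverse(graph):
    """Return the graph with every directed edge flipped."""
    rev = {x: [] for x in graph}
    for x, edges in graph.items():
        for y, w in edges:
            rev.setdefault(y, []).append((x, w))
    return rev


graph = {"u": [("a", 1), ("b", 4)], "a": [("b", 2)], "b": []}
fwd_dist, fwd_parent = shortest_path_tree(graph, "u")           # forward SPT at u
bwd_dist, bwd_parent = shortest_path_tree(reverse(graph), "b")  # backward SPT at b
```

In the backward tree, `bwd_parent[x]` is the next hop on the shortest path from x toward the root.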

Let $T_f$ (resp. $T_b$) be the forward (resp. backward) SPT of the
aforementioned subgraph that is rooted at u. Observe that $T_f$ (resp.
$T_b$) can be computed by one traversal of the subgraph using Dijkstra's
algorithm. Let v be any node in $T_f$, such that u ranks lower
than v but higher than any other ancestor of v in $T_f$. For any such v, AH
generates a shortcut $\langle u, v \rangle$ with a length equal to the distance from
u to v in $T_f$. In addition, AH associates $\langle u, v \rangle$ with a node w on
the path from u to v, such that w ranks higher than any node on
the path except u and v. This is to indicate that, when answering
shortest path queries, AH can replace $\langle u, v \rangle$ with a two-hop path
$\langle u, w, v \rangle$. (Our algorithm guarantees that such a two-hop path always
exists.) Similarly, for any node v in $T_b$, AH creates a shortcut
$\langle v, u \rangle$, if u's rank is lower than v's but higher than those of v's other
ancestors in $T_b$. Furthermore, the shortcut is associated with the node
$w'$ that ranks the highest among v's ancestors except u. We refer to
the shortcuts constructed above as level-i edges.² Intuitively, these
shortcuts connect each level-i node u directly to its nearby higher-rank
nodes. By following these shortcuts during query processing,
AH can avoid visiting less important nodes, which helps improve
efficiency.
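The selection rule for level-i edges in the forward tree can be illustrated with a small sketch. It assumes that the tree is given as `dist` and `parent` dictionaries (such as those produced by a Dijkstra traversal rooted at u) and that `rank` maps each node to a comparable rank value; these names are hypothetical. The sketch walks each candidate's ancestor chain directly, so it makes no claim about matching the efficiency of the preprocessing described in the text.

```python
def forward_shortcuts(u, dist, parent, rank):
    """Map each qualifying v to (shortcut length, associated node).

    v qualifies when u ranks lower than v but higher than every node
    strictly between u and v on the tree path. The associated node is
    the highest-ranked node strictly between them, or None when v is a
    direct child of u (no replacement is needed for a plain edge).
    """
    shortcuts = {}
    for v in dist:
        if v == u or rank[v] <= rank[u]:
            continue
        inner = []
        x = parent[v]
        while x != u:            # collect nodes strictly between u and v
            inner.append(x)
            x = parent[x]
        if all(rank[x] < rank[u] for x in inner):
            via = max(inner, key=rank.get) if inner else None
            shortcuts[v] = (dist[v], via)
    return shortcuts


# Toy tree (assumed): u -> a -> v -> c and u -> b.
dist = {"u": 0, "a": 1, "v": 2, "b": 5, "c": 3}
parent = {"u": None, "a": "u", "v": "a", "b": "u", "c": "v"}
rank = {"a": 0, "u": 1, "b": 2, "v": 3, "c": 4}
result = forward_shortcuts("u", dist, parent, rank)
```

Here `c` receives no shortcut from `u`, because its ancestor `v` already ranks higher than `u`.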

Besides the level-i edges, AH creates a shortcut from u to a
node v in $T_f$ if (i) u and v are both at level i or above, and (ii) all
ancestors of v except u are below level i. Likewise, if $T_b$ contains
a node v with a level at least i, such that u is the only ancestor
of v at level i or higher, then AH generates a shortcut from v to
u. These shortcuts are to ensure that $G^*_i$ would remain connected
when we reduce $G^*_i$ by removing some nodes below level i, as will
be clarified shortly.

When $i > 0$ (i.e., $G^*_i$ is produced from a previous reduction
step), AH also generates some extra shortcuts (referred to as elevating
edges), in a manner slightly different from the construction of
level-i edges. First, AH inspects each node u in $G^*_i$ at a level lower
than i, and it examines the (5×5)-cell region C in $R_i$ that is centered
at u. Then, AH constructs a subgraph of $G^*_i$ that comprises
all level-i edges covered by C, as well as all edges that connect
u with any node at level i or above. After that, AH computes the
subgraph's forward and backward SPTs rooted at u. Let $P_f$ be any
path in the forward SPT that connects u to a node outside of C,
and let v be the first node on $P_f$ at level i or above (our algorithm
ensures that such a v always exists). AH constructs a shortcut
$\langle u, v \rangle$, and associates it with the node that immediately follows u
on $P_f$, if u is below level $i - 1$. On the other hand, if u is at level
$i - 1$, then the shortcut is associated with the first node on $P_f$ that
ranks higher than u. This shortcut is constructed to enable AH to
efficiently traverse from u to the i-th level of $H^*$. Similarly, if the
backward SPT contains a path $P_b$ that links u with a node located
beyond C, AH creates a shortcut $\langle v, u \rangle$, where v is the node closest
to u on $P_b$ among those at level i or above. If u is below level $i - 1$,
the shortcut is associated with the node that immediately precedes
u on $P_b$; otherwise, it is associated with the node that is closest to
u on $P_b$ among those with higher ranks than u.

Once all level-i edges and elevating edges are created, they are
inserted into both $G^*_i$ and $H^*$. After that, AH reduces $G^*_i$ by retaining
only (i) the border nodes in $R_{i+2}$ and (ii) nodes at level i or
above. The resulting graph is defined as $G^*_{i+1}$ and is fed into the
next reduction step.

4.3 Query Processing 

²If multiple shortcuts are constructed from one node u to another
node v, AH retains only the shortest one.



The query processing algorithm of AH is similar to that of FC.
In particular, for a distance query from a node s to another node t,
AH also answers the query with two traversals of the node hierarchy
$H^*$ starting from s and t, respectively. As with the case of
FC, each traversal of AH is performed with a constrained version
of Dijkstra's algorithm. However, the constraints adopted by AH
are slightly different: it adopts the proximity constraint (see Section 3.2)
and a rank constraint as follows:



• Rank Constraint: When the traversal from s (resp. t) visits a 
node u, it ignores any neighbor of u that ranks lower than it. 

Intuitively, the rank constraint is a refined version of the level constraint,
in that it takes into account not only the levels of nodes but
also the strict total order defined on each level of $H^*$. It leads to
higher query efficiency as it helps AH bypass a larger number of
relatively unimportant nodes during query processing.
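The effect of the rank constraint can be illustrated with a simplified bidirectional search. The sketch below is not the full AH query algorithm: it applies only the rank constraint and omits both the proximity constraint and the elevating edges. The adjacency dictionaries `forward_adj` and `backward_adj` (the latter holding reversed edges), the `rank` dictionary, and the function names are all assumptions made for illustration.

```python
import heapq


def upward_search(adj, source, rank):
    """Dijkstra that never relaxes an edge into a lower-ranked neighbor."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in adj.get(x, ()):
            if rank[y] < rank[x]:
                continue  # rank constraint: ignore lower-ranked neighbors
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return dist


def distance_query(forward_adj, backward_adj, s, t, rank):
    """Combine the two constrained traversals at their common nodes."""
    from_s = upward_search(forward_adj, s, rank)
    to_t = upward_search(backward_adj, t, rank)
    common = from_s.keys() & to_t.keys()
    return min((from_s[x] + to_t[x] for x in common), default=float("inf"))


# Toy undirected path a - b - c where b has the highest rank.
adj = {"a": [("b", 1)], "b": [("a", 1), ("c", 1)], "c": [("b", 1)]}
rank = {"a": 0, "b": 2, "c": 1}
d = distance_query(adj, adj, "a", "c", rank)
```

In the toy example each traversal stops at `b`, because both of its neighbors rank lower, and the two traversals meet there.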

In addition, AH also exploits the elevating edges in $H^*$ (see Section 4.2)
to reduce query cost, based on the following lemma:

Lemma 3. For any two nodes $u, v \in G$, if no (3×3)-cell region
in $R_i$ ($i \in [1, h]$) can cover u and v simultaneously, then the
shortest path from u to v must go through a node at level i or above.

Let $R_j$ ($j \in [1, h]$) be the coarsest grid where no (3×3)-cell region
contains both s and t. By Lemma 3, the shortest path from s to t
should pass through at least one node with a level at least j. This
indicates that AH's traversal from s would meet its traversal from
t at level j or above. Therefore, if s is a border node in $R_j$ (in
which case s has elevating edges to level j), then when we start the
traversal from s, we can follow the elevating edges of s to move
directly to level j, ignoring any edge that connects s to a node at a
level lower than j. After that, we can continue the traversal from
level j under the rank and proximity constraints.
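The level j used above can be found by scanning the grids from coarsest to finest. The sketch below rests on assumptions that are not stated in this section: all grids share the origin (0, 0), the cells of $R_1$ are squares of side `cell_side_1`, and the cell side doubles from $R_i$ to $R_{i+1}$. Under these assumptions, some (3×3)-cell region covers both points exactly when their cell indices differ by at most 2 along each axis. The function name is hypothetical.

```python
def coarsest_separating_grid(s, t, h, cell_side_1):
    """Return the largest j in [1, h] such that no (3x3)-cell region of R_j
    covers both points, or 0 if even the finest grid R_1 covers them."""
    for j in range(h, 0, -1):
        side = cell_side_1 * 2 ** (j - 1)          # assumed cell side of R_j
        dx = abs(int(s[0] // side) - int(t[0] // side))
        dy = abs(int(s[1] // side) - int(t[1] // side))
        if dx > 2 or dy > 2:
            return j
    return 0


j = coarsest_separating_grid((0.5, 0.5), (12.5, 0.5), h=4, cell_side_1=1.0)
```

In the usage line, the two points fall into adjacent cells of $R_4$ (side 8) but into cells 0 and 3 of $R_3$ (side 4), so the function returns 3.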

More generally, for any level-i ($i < j$) node v visited in the
traversal from s, if v is a border node in $R_j$, then we move along
the elevating edges of v to level j or above, omitting any other
edges of v. On the other hand, if v is a border node in $R_{j'}$ instead
of $R_j$ ($j' < j$), then we follow the elevating edges of v to level $j'$ or
higher, i.e., we traverse as close to level j as possible. Meanwhile,
if v does not have any elevating edges or v is at a level at least j,
then we traverse the edges of v that satisfy the rank and proximity
constraints. The same strategy is used when AH traverses from t.
This traversal strategy reduces query time, since it enables AH to
avoid visiting the low levels of the node hierarchy $H^*$.

So far we have only discussed distance queries. For any shortest
path query from s to t, AH first treats it as a distance query and
computes the shortest path $P'$ from s to t in $H^*$. After that, AH
recursively replaces each shortcut in $P'$ with its corresponding two-hop
path, which converts $P'$ to the actual shortest path from s to t
in G, as explained in Section 4.1.

4.4 Node Ranking and Selection 

As mentioned, the shortcut construction algorithm of AH assumes
that there is a strict total order on the nodes in the same
level. While any strict total order can be used without affecting
the asymptotic bounds of AH, we have found a heuristic ordering
approach that leads to high practical performance. Specifically,
for nodes in the 0-th level of the node hierarchy $H^*$, we adopt a
random order; for nodes in the i-th level ($i \in [1, h]$) of $H^*$, we
derive their ordering based on information from the preprocessing
procedure of AH. To explain, recall that AH decides node levels
by recursively applying a reduction procedure on the road network
G. During the i-th reduction step ($i \in [1, h-1]$), AH examines
a graph that contains level-(i−1) cores; it identifies a set $S_i$
of pseudo-arterial edges in the graph, marks the endpoints of those
edges as level-i cores, and then assigns all unmarked level-(i−1)
cores to the (i−1)-th level of $H^*$.

We observe that the edges in $S_i$ are connected to some extent,
and there are some level-i cores that serve as hub nodes for the
connections (i.e., they are adjacent to a sizable number of edges in
$S_i$). Intuitively, those hub nodes are more important than the rest
of the level-i cores. Motivated by this, we order the level-i cores
using a vertex cover approach: we inspect the graph formed by the
edges in $S_i$, and we compute a vertex cover of the graph using the
linear-time $O(\log n)$-approximation algorithm [7]. The output of
the algorithm is a sequence $\xi$ of nodes in the graph, such that the
j-th node in $\xi$ is adjacent to the largest number of edges that are
disjoint from the first $j - 1$ nodes. Based on $\xi$, we order the level-i
cores as follows: the j-th node in $\xi$ is given the j-th highest rank,
and the level-i cores not in $\xi$ are given the lowest ranks arbitrarily.
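The ordering produced by the greedy vertex cover heuristic can be sketched as below. This is a plain quadratic-time version written for clarity, not the linear-time algorithm cited in the text; the function name, the treatment of edges as undirected pairs without self-loops, and the tie-breaking rule (smallest node identifier first) are assumptions.

```python
def greedy_cover_order(edges):
    """Repeatedly pick the node covering the most still-uncovered edges."""
    remaining = {frozenset(e) for e in edges}
    order = []
    while remaining:
        degree = {}
        for e in remaining:
            for x in e:
                degree[x] = degree.get(x, 0) + 1
        # sorted() makes tie-breaking deterministic: smallest identifier wins.
        best = max(sorted(degree), key=degree.get)
        order.append(best)
        remaining = {e for e in remaining if best not in e}
    return order


# Toy edge set (assumed): node 1 is a hub adjacent to three edges.
sequence = greedy_cover_order([(1, 2), (1, 3), (1, 4), (4, 5)])
```

In the toy example, node 1 is selected first and node 4 second; nodes 2, 3 and 5 never enter the sequence and would receive the lowest ranks.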

Interestingly, we find that if a level-i core does not appear in $\xi$,
then we can downgrade it to a level-(i−1) core without affecting the
correctness or asymptotic performance of AH. Such downgrading
reduces the number of high-level nodes in the node hierarchy $H^*$,
which in turn improves query efficiency, since the high levels of $H^*$
are frequently traversed during query processing. Our implementation
of AH adopts this downgrading approach to improve query
performance.

4.5 Space and Time Complexities 

To establish the space and time complexities of AH, we first in- 
troduce a lemma that quantifies the densities of nodes in each level 
of AH's node hierarchy H*: 

Lemma 4. Any ($a \times a$)-cell region in $R_i$ contains $O(a^2\lambda^2)$
nodes whose levels in $H^*$ are no lower than i, where $\lambda$ is the arterial
dimension of G.

Our proof of Lemma 4 is similar to that of Lemma 1. We first
show that any ($a \times a$)-cell region B in $R_i$ contains the endpoints of
$O(a^2\lambda)$ arterial edges in G. After that, we prove that there exists
a one-to-many mapping from the arterial edges in G to the nodes
in B with levels at least i, such that each edge is mapped to $O(\lambda)$
nodes. Based on this, we show that any ($a \times a$)-cell region in $R_i$
contains only $O(a^2\lambda^2)$ nodes at level i or above.

Space Overhead. Given Lemma 4, we can prove that each node in
$H^*$ has $O(h\lambda^2)$ elevating edges and $O(\lambda^2)$ non-elevating edges.
This is because, by the preprocessing algorithm of AH, there is an
elevating edge from a node u to a level-i node v only if u and v are
contained in the same (4×4)-cell region in $R_i$. By Lemma 4, there
exist $O(\lambda^2)$ such level-i nodes. Since $H^*$ contains only $h + 1$ levels,
the total number of elevating edges adjacent to u is $O(h\lambda^2)$.
Similarly, we can prove that each node in $H^*$ has $O(\lambda^2)$ non-elevating
edges. Therefore, the space overhead of AH is $O(hn\lambda^2)$,
which reduces to $O(hn)$ when $\lambda$ is a constant.

Query Time. AH answers any distance query with two traversals
of $H^*$, starting from the source s and destination t of the query, respectively.
Due to the proximity constraint, in the i-th level of $H^*$
($i \in [0, h]$), each traversal of AH only visits the nodes in a (5×5)-cell
region in $R_{i+1}$. By Lemma 4, such a region only contains
$O(\lambda^2)$ nodes at level i. Hence, the total number of nodes traversed
by AH is $O(h\lambda^2)$. Furthermore, for each node v visited during a
traversal, AH either follows the elevating edges of v to a certain
level of $H^*$, or moves along the non-elevating edges of v that satisfy
the rank and proximity constraints. As previously discussed, v
has $O(\lambda^2)$ elevating edges to each level of $H^*$, and has $O(\lambda^2)$ non-elevating
edges. Therefore, the total number of edges visited by AH
is $O(h\lambda^4)$. Since each traversal is performed using Dijkstra's algorithm,
its overall time complexity is $O(h\lambda^2 \log(h\lambda^2) + h\lambda^4)$.
Consequently, when $\lambda$ is a constant, the time complexity of AH for
a distance query is $O(h \log h)$.
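For reference, the counting argument above can be laid out step by step. The display below only restates the bounds from the preceding paragraph, assuming the per-level node count of $O(\lambda^2)$ from Lemma 4 and the standard Dijkstra cost of $O(V \log V + E)$ for V visited nodes and E visited edges.

```latex
\begin{align*}
V &= (h + 1) \cdot O(\lambda^2) = O(h\lambda^2)
  && \text{(nodes visited over all levels)} \\
E &= O(h\lambda^2) \cdot O(\lambda^2) = O(h\lambda^4)
  && \text{(edges followed per visited node)} \\
T_{\text{distance}} &= O(V \log V + E)
  = O\bigl(h\lambda^2 \log(h\lambda^2) + h\lambda^4\bigr) \\
  &= O(h \log h)
  && \text{(when $\lambda$ is a constant)}
\end{align*}
```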

To answer a shortest path query from s to t, AH first processes its
corresponding distance query to retrieve the shortest path $P'$ from
s to t in $H^*$, and then it transforms $P'$ into the actual shortest path
P from s to t in G. The transformation from $P'$ to P takes $O(k)$
time, where k is the number of edges in P. Therefore, AH requires
$O(k + h \log h)$ time to answer a shortest path query.

Preprocessing Cost. The preprocessing algorithm of AH consists
of three steps: (i) assigning nodes to each level of $H^*$, (ii) deriving
the strict total order on nodes at the same level, and (iii) creating
shortcuts in $H^*$. When assigning nodes to the i-th level of $H^*$
($i \in [0, h-1]$), AH inspects each non-empty (4×4)-cell region
in $R_{i+1}$, and constructs a subgraph that consists of the level-i cores
and border nodes in the region. For each node in the subgraph,
AH needs to apply Dijkstra's algorithm to traverse the subgraph a
constant number of times. Given Lemma 4 and the fact that each
level-i core or border node (i) has $O(\lambda^2)$ edges and (ii) is contained
in a constant number of (4×4)-cell regions in $R_{i+1}$, it can be proved
that AH requires $O(n^2\lambda^2)$ time to assign nodes to level i of $H^*$.
Meanwhile, AH takes only $O(n)$ time to derive the strict total order
at level i of $H^*$, since the derivation is based on a linear-time
algorithm for vertex cover.

To construct shortcuts at the i-th level of $H^*$, AH needs to inspect
a graph $G^*_i$ reduced from G. For each node u in $G^*_i$, AH
examines the (5×5)-cell region in $R_{i+1}$ that is centered at u, and
it creates shortcuts for u by traversing the nodes in the region at
level i or above. Based on Lemma 4, it can be proved that the total
cost of generating shortcuts for u is $O(\lambda^2)$. As such, the time
required to create shortcuts at level i of $H^*$ is $O(n\lambda^2)$.

Summing up the above analysis, we have the following theorem: 



Theorem 1. Given a road network with a constant arterial dimension,
AH takes $O(hn^2)$ time to construct an index that requires
$O(hn)$ space. With the index, AH answers any distance query in
$O(h \log h)$ time and any shortest path query in $O(k + h \log h)$ time,
where k is the number of edges in the shortest path.



5. RELATED WORK 

Numerous techniques (e.g., li4HSl[8l [TQ|424 1) have been proposed 
for processing shortest path and distance queries on road networks. 
Many of these techniques focus on practical performance, and they 
are mostly heuristic-based. For example, ALT [12] pre-computes
the road network distances from each node to a fixed set of nodes
(referred to as landmarks), and then utilizes those pre-computed
distances to reduce the search space of each query. HiTi [17]
partitions the road network into vertex-disjoint subgraphs, and then
pre-computes the shortest paths that connect different subgraphs to
facilitate query processing. We refer the reader to [25] for a survey
of the existing heuristic-based techniques.
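
The landmark idea behind ALT can be illustrated with a small sketch. It is not the implementation of [12]: the toy undirected graph, the choice of landmarks, and the helper names `dijkstra` and `landmark_lower_bound` are assumptions made for illustration. The sketch relies only on the triangle inequality, which for an undirected network gives $|d(L, s) - d(L, t)| \le d(s, t)$ for any landmark $L$.

```python
import heapq

def dijkstra(graph, source):
    """Return the network distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def landmark_lower_bound(tables, s, t):
    """Lower bound on d(s, t) from pre-computed landmark distance tables."""
    return max(abs(table[s] - table[t]) for table in tables)

# Hypothetical undirected toy network: node -> [(neighbor, weight), ...]
graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("a", 2), ("c", 1), ("d", 4)],
    "c": [("a", 5), ("b", 1), ("d", 1)],
    "d": [("b", 4), ("c", 1)],
}

# Pre-computation: one distance table per landmark (here "a" and "d").
tables = [dijkstra(graph, landmark) for landmark in ("a", "d")]

bound = landmark_lower_bound(tables, "a", "d")
exact = dijkstra(graph, "a")["d"]
```

A search guided by such a bound can skip nodes whose bound already exceeds the best distance found, which is how the pre-computed distances shrink the search space.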

In addition, there also exist a large number of worst-case
efficient algorithms for shortest path and distance queries (see
(H[ll[T8l[l9l|2Tl|23l and the references therein). Most of these 
algorithms assume that the road network is a planar graph with non- 
negative weights, while some recent work Il4l l21l|23l adopts more 
subtle assumptions on the road network to derive tighter bounds on 
space and time complexities. Table 1 lists the performance bounds
of several of the most recent algorithms. Compared with the state of the
art, our method offers superior query efficiency while incurring 
moderate costs of space and pre-computation. 

The work most related to ours is by Bast et al. [5], Abraham
et al. [4], and Geisberger et al. [11]. Bast et al. [5] observe that, in
practice, there often exists a small set $S$ of nodes in the road network
(referred to as transit nodes), such that any shortest path connecting
two distant locations must pass through at least one node in $S$.
Based on this observation, Bast et al. propose a heuristic solution
for answering shortest path and distance queries. However, the
proposed solution is shown to be flawed in that it may return incorrect
query results [25]. Our notion of arterial dimension is motivated by
Bast et al.'s observation, but our definition of arterial edges is
considerably different from Bast et al.'s formulation of transit nodes.

Abraham et al. [4] introduce a theoretical abstraction of Bast et
al.'s observation, based on which they propose several worst-case
efficient algorithms for shortest path and distance queries. The
proposed algorithms adopt an assumption that is similar in spirit to our
Assumption 1 but is more elegant in a theoretical sense. Nevertheless,
the assumption adopted by Abraham et al. has not been tested
on any real road networks, while our Assumption 1 is backed by
empirical evidence from real datasets, as shown in Section 2.
Furthermore, Abraham et al.'s algorithms require pre-computing the
shortest path between any pair of nodes in the road network, which
renders them inapplicable even for moderate-size datasets.

Geisberger et al. [11] propose a road network index called
Contraction Hierarchies (CH), which (i) heuristically imposes a total
order on the road network nodes and (ii) constructs shortcuts from
low-rank nodes to high-rank nodes to enable efficient query
processing. Our AH method is inspired by CH, and it outperforms CH
in terms of both asymptotic and practical performance, as will be
shown in Section 6.

6. EXPERIMENTS 

This section experimentally compares our AH method with three
techniques: (i) Dijkstra's algorithm [9], (ii) Spatially Induced Linkage
Cognizance (SILC) [21], one of the most advanced worst-case
efficient indices for shortest path and distance queries, and (iii)
Contraction Hierarchies (CH) [11], a heuristic approach that offers
the highest overall efficiency in shortest path and distance queries
while incurring minimal costs of space and pre-computation, as
shown in a recent experimental study [25] of the state of the art.
We implement AH and Dijkstra's algorithm using C++, and we
obtain the C++ implementations of SILC and CH from [1, 2]. All
experiments are conducted on a 64-bit Windows machine with an
Intel Xeon 2.8GHz CPU and 32GB RAM.

Table 2: Dataset Characteristics

Name   Corresponding Region    Number of Nodes   Number of Edges
DE     Delaware                         48,812           120,489
NH     New Hampshire                   115,055           264,218
ME     Maine                           187,315           422,998
CO     Colorado                        435,666         1,057,066
FL     Florida                       1,070,376         2,712,798
CA     California and Nevada         1,890,815         4,657,742
E-US   Eastern US                    3,598,623         8,778,114
W-US   Western US                    6,262,104        15,248,146
C-US   Central US                   14,081,816        34,292,496
US     United States                23,947,347        58,333,344

6.1 Datasets and Queries 

We use ten publicly available datasets [3], each of which
corresponds to a part of the road network in the US. Table 2 shows
the number of nodes and edges in the data. For each edge in the
datasets, its weight quantifies the time required to traverse the road
segment that is represented by the edge.

Following previous work [25], we generate ten sets of queries
$Q_1, Q_2, \ldots, Q_{10}$ on each dataset as follows. We first estimate the
maximum network distance $l_{max}$ between two nodes in the road
network. After that, we insert 10,000 pairs of nodes $(s, t)$ into $Q_i$
($i \in [1, 10]$) as queries, such that the distance between $s$ and $t$ is in
$[2^{i-11} \cdot l_{max}, 2^{i-10} \cdot l_{max})$. In other words, the network distance
between any pair of nodes in $Q_i$ is larger than that in $Q_{i-1}$.
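
The bucketing rule above can be sketched as follows. The function name `query_set_index` and the sample value of `l_max` are assumptions for illustration; the sketch only encodes the ranges $[2^{i-11} \cdot l_{max}, 2^{i-10} \cdot l_{max})$ stated in the text and does not reproduce how the node pairs themselves are sampled.

```python
def query_set_index(dist, l_max, num_sets=10):
    """Return i such that dist falls in [2^(i-num_sets-1) * l_max,
    2^(i-num_sets) * l_max), or None if no query set covers dist."""
    for i in range(1, num_sets + 1):
        low = l_max * 2.0 ** (i - num_sets - 1)
        high = l_max * 2.0 ** (i - num_sets)
        if low <= dist < high:
            return i
    return None

l_max = 1024.0  # hypothetical estimate of the maximum network distance

far_pair = query_set_index(600.0, l_max)   # falls in [512, 1024)
mid_pair = query_set_index(300.0, l_max)   # falls in [256, 512)
near_pair = query_set_index(1.0, l_max)    # falls in [1, 2)
too_close = query_set_index(0.5, l_max)    # below the range of Q_1
```

With ten sets, $Q_{10}$ covers the upper half of the distance range and each lower set halves the range again.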

6.2 Efficiency for Distance Queries 

Our first set of experiments focuses on distance queries. Figure 8(a)
shows the average running time of each technique when answering
the distance queries in $Q_i$ ($i \in [1, 10]$) on the DE road network
(which contains 48,812 nodes). Observe that AH consistently
outperforms all competitors including CH, the state-of-the-art heuristic
approach. In particular, on query sets $Q_8$, $Q_9$, and $Q_{10}$ (where
each query concerns two distant locations), AH's running time is
lower than that of CH and SILC by more than 50%. CH performs
slightly worse than SILC on $Q_1, Q_2, \ldots, Q_6$, but it is evidently
superior to SILC on $Q_8$, $Q_9$, and $Q_{10}$. Dijkstra's algorithm incurs the
highest computation overhead on all query sets.

Figure 8(b) shows the query processing time of each method on
NH, which is about 2 times the size of DE. Again, AH is
consistently more efficient than the other three techniques, especially on
query sets $Q_8$, $Q_9$, and $Q_{10}$. CH surpasses SILC on most query
sets, which contrasts with the case on DE where CH only dominates
SILC on $Q_8$, $Q_9$, and $Q_{10}$. This indicates that SILC does not scale
as well as CH. Dijkstra's algorithm is still the least efficient one
among the four techniques. Similar results are shown in Figures 8(c)
and 8(d).

Figures 8(e)-8(j) show the running time of AH, CH, and Dijkstra's
algorithm on the six largest datasets. (SILC is omitted since its
preprocessing and space overheads on these datasets are
prohibitive, as will be shown in Section 6.4.) The relative performance
of AH, CH, and Dijkstra's algorithm remains the same as in Figures
8(a)-8(d), with AH (resp. Dijkstra's algorithm) being the most (resp.
least) efficient method by far.

6.3 Efficiency for Shortest Path Queries

Figure 9 shows the average computation time of each technique
when answering the shortest path queries in $Q_i$ ($i \in [1, 10]$) on
all ten datasets. Regardless of the dataset, AH significantly
outperforms the other three techniques. SILC is superior to CH on DE,
but the performance of the two methods becomes comparable on
the larger datasets. Dijkstra's algorithm is the least efficient one in
all cases.

The running time of AH is higher for shortest path queries than 
distance queries. This is because, when answering a shortest path 
query from a source s to a destination t, AH first (i) computes the 
distance from s to t, and then (ii) derives the shortest path based 
on the result of the distance query. As a consequence, any shortest 
path query incurs a strictly higher overhead than a distance query 
with the same source and destination. Similarly, CH also incurs a 
higher cost for shortest path queries than distance queries. 

In contrast, the running time of SILC (resp. Dijkstra's algorithm)
is identical in Figures 8 and 9. The reason is that SILC (resp.
Dijkstra's algorithm) answers any distance query by first deriving the
shortest path $P$ from the source to the destination, and then
returning the length of $P$. Computing the length of $P$ incurs only
negligible overhead, which explains why the costs of distance queries
are the same as those of shortest path queries.

6.4 Space and Preprocessing Costs 

In the last set of experiments, we evaluate the space and
pre-computation overheads of AH, SILC, and CH. (We omit Dijkstra's
algorithm as it does not require building an index on the road
network.) Figure 10(a) illustrates the index space required by AH, SILC,
and CH on each dataset. Although SILC is worst-case efficient, its
space overhead is extremely high, and it increases super-linearly
with the number of nodes $n$ in the road network. In particular, for
all datasets with more than 500,000 nodes, the index of SILC is
more than 32GB in size, i.e., it cannot fit in the main memory of
our machine. For this reason, we omit SILC from the experiments
on those datasets. Meanwhile, the space consumption of AH is
moderate, and it increases linearly with $n$. This is consistent with
our analysis in Section 4.3 that AH incurs a linear space complexity.
Lastly, CH is the most space-economical method: it requires no
more than 2GB of space even for the largest dataset.

Figure 10(b) shows the time required by AH, SILC, and CH to
construct indices on our datasets. Observe that SILC has a
pre-computation cost super-linear in $n$, and it requires more than one
week to preprocess even the relatively small dataset CO (which
contains 435,666 nodes). In contrast, the preprocessing time of
AH exhibits a linear increase with $n$, even though AH's index
construction algorithm has an $O(hn^2)$ time complexity. Furthermore,
the pre-computation cost of AH is fairly small: it only requires
around three hours to preprocess the US road network with 23
million nodes. On the other hand, the pre-computation time of CH is
the lowest and is below 40 minutes for all datasets.

7. CONCLUSION 

This paper presents Arterial Hierarchy (AH), a worst-case effi- 
cient index structure for shortest path and distance queries on road 
networks. Under a practical assumption about the road network, 
AH offers superior query time complexities in both shortest path 
and distance queries, and its space and preprocessing time com- 
plexities are comparable to the best existing worst-case efficient 
methods. With extensive experiments on real datasets, we show 
that AH also provides excellent query efficiency in practice, and it 
even outperforms CH (i.e., the state-of-the-art heuristic method) in 
terms of query time. Furthermore, the space consumption and
preprocessing cost of AH are fairly small: it takes only around three
hours to preprocess a continent-scale road network with 23 million
nodes, and the resulting index structure is no more than 32GB in
size. For future work, we plan to extend AH for the scenarios when
(i) the weight of each road network edge may change with time
(e.g., due to traffic conditions) and (ii) the memory footprint of the
index structure is a significant concern (as is the case for mobile
devices).

Figure 8: Efficiency of distance queries vs. query set.
APPENDIX

A. UNIQUE SHORTEST PATHS VIA WEIGHT PERTURBATION

Let $h$ be as defined in Section 3.1. The solutions in the paper
rely on the following assumption:

Assumption 2. For any (4×4)-cell region $B$ in the square
grid $R_i$ ($i \in [0, h]$), there do not exist two local shortest paths
in $B$ that share the same endpoints and have the same length.

In this section, we show that Assumption 2 can be enforced by
adding a small perturbation to the weight of each edge in the road
network $G$. Specifically, we associate each edge $e$ in $G$ with an
integer $p(e)$ that is randomly selected in the range $[0, \tau - 1]$, where
$\tau$ is a parameter to be specified shortly. We refer to $p(e)$ as the
nuance of $e$, and we define the nuance of a path $P$ as the sum of the
nuances of the edges on the path, denoted as $p(P)$. For any two paths
$P_1$ and $P_2$ such that $l(P_1) = l(P_2)$, we consider $P_1$ shorter than
$P_2$ if $p(P_1) < p(P_2)$. We will establish the following theorem.

Theorem 2. Let $\Delta$ be the largest degree of any node in $G$. If
$\tau \ge 32hn^3\binom{\Delta}{2}$, then Assumption 2 holds with a probability at least
$1 - 1/n$.

In other words, by setting $\tau$ to a sufficiently large value, we can
ensure that Assumption 2 holds with an overwhelming probability.
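
The tie-breaking rule can be sketched as a Dijkstra search that compares (length, nuance) pairs lexicographically. This is an illustration under stated assumptions rather than the paper's implementation: the function names `assign_nuances` and `perturbed_dijkstra`, the toy graph, and the hand-picked nuance values are hypothetical.

```python
import heapq
import random

def assign_nuances(graph, tau, rng):
    """Draw an integer nuance in [0, tau - 1] for every directed edge."""
    return {(u, v): rng.randrange(tau) for u in graph for v, _ in graph[u]}

def perturbed_dijkstra(graph, nuance, source):
    """Return {node: (length, nuance)} of the lexicographically smallest
    path from source; equal-length paths are ordered by their nuance."""
    best = {source: (0, 0)}
    heap = [(0, 0, source)]
    while heap:
        d, p, u = heapq.heappop(heap)
        if (d, p) > best[u]:
            continue  # stale heap entry
        for v, w in graph[u]:
            cand = (d + w, p + nuance[(u, v)])
            if v not in best or cand < best[v]:
                best[v] = cand
                heapq.heappush(heap, (cand[0], cand[1], v))
    return best

# Hypothetical diamond: s->a->t and s->b->t both have length 2.
graph = {"s": [("a", 1), ("b", 1)], "a": [("t", 1)], "b": [("t", 1)], "t": []}

# Hand-picked nuances so that the outcome is easy to follow.
nuance = {("s", "a"): 3, ("a", "t"): 1, ("s", "b"): 0, ("b", "t"): 2}
result = perturbed_dijkstra(graph, nuance, "s")

# Random nuances, as in the perturbation scheme (tau chosen arbitrarily here).
random_nuance = assign_nuances(graph, tau=1000, rng=random.Random(7))
```

In the hand-picked case the path through `b` has nuance 2 and the path through `a` has nuance 4, so the search settles on the path through `b` although both have length 2.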

Our proof of Theorem 2 is based on a few lemmas as follows.

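Lemma 5. Let $P$ and $P'$ be two paths in $G$. Then, $p(P) = p(P')$
occurs with at most $1/\tau$ probability.

Proof. Assume that $P = (e_1, \ldots, e_i)$ and $P' = (e'_1, \ldots, e'_j)$.
Then,

$$
\begin{aligned}
\Pr\{p(P) = p(P')\}
&= \Pr\Big\{\sum_{1 \le k \le i} p(e_k) = \sum_{1 \le k \le j} p(e'_k)\Big\} \\
&= \Pr\Big\{p(e_1) = \sum_{1 \le k \le j} p(e'_k) - \sum_{2 \le k \le i} p(e_k)\Big\} \\
&= \sum_{0 \le x \le \tau - 1} \Big(\Pr\{p(e_1) = x\} \cdot \Pr\Big\{\sum_{1 \le k \le j} p(e'_k) - \sum_{2 \le k \le i} p(e_k) = x\Big\}\Big) \\
&= \sum_{0 \le x \le \tau - 1} \Big(\frac{1}{\tau} \cdot \Pr\Big\{\sum_{1 \le k \le j} p(e'_k) - \sum_{2 \le k \le i} p(e_k) = x\Big\}\Big) \\
&= \frac{1}{\tau} \sum_{0 \le x \le \tau - 1} \Pr\Big\{\sum_{1 \le k \le j} p(e'_k) - \sum_{2 \le k \le i} p(e_k) = x\Big\} \\
&\le \frac{1}{\tau}.
\end{aligned}
$$

□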
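
The bound in Lemma 5 can be checked exhaustively for small parameters. The sketch below is only a sanity check, not part of the proof; the function name `tie_probability` is hypothetical, and it assumes the two paths share no edges so that all nuances are independent.

```python
from fractions import Fraction
from itertools import product

def tie_probability(i, j, tau):
    """Exact probability that two edge-disjoint paths with i and j edges
    receive the same total nuance, with nuances uniform in [0, tau - 1]."""
    ties = 0
    for values in product(range(tau), repeat=i + j):
        if sum(values[:i]) == sum(values[i:]):
            ties += 1
    return Fraction(ties, tau ** (i + j))

single_edges = tie_probability(1, 1, 5)   # two one-edge paths
two_vs_one = tie_probability(2, 1, 5)     # a two-edge path vs. a one-edge path
```

For two single-edge paths the probability is exactly $1/\tau$, so the bound is tight in that case; longer paths can only lower it.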



Let $\Delta$ be the maximum degree of any node in $G$. Based on
Lemma 5, we have the following result:

Lemma 6. Let $B$ be a (4×4)-cell region in $R_i$ ($i \in [0, h]$). For
a node $s$ in $B$, let $\zeta$ (resp. $\zeta'$) be the event that there exists another
node $v$, such that the local shortest path from $s$ to $v$ (resp. from $v$
to $s$) in $B$ is not unique. Then, $\Pr\{\zeta \vee \zeta'\} \le \binom{\Delta}{2} \cdot 2n/\tau$.

Proof. We will prove that $\Pr\{\zeta\} \le \binom{\Delta}{2} \cdot n/\tau$. By symmetry,
it can also be shown that $\Pr\{\zeta'\} \le \binom{\Delta}{2} \cdot n/\tau$, leading to
$\Pr\{\zeta \vee \zeta'\} \le \binom{\Delta}{2} \cdot 2n/\tau$.

Let $d_s(v)$ be the length of the local shortest path from
$s$ to $v$ in $B$. Let $(v_0, v_1, \ldots, v_k)$ be a permutation of all the
nodes that can be reached from $s$ via local paths in $B$, such that
$d_s(v_i) \le d_s(v_j)$ for any $0 \le i < j \le k$. That is, $v_0, v_1, \ldots, v_k$
are sorted in a non-decreasing order of their distances from $s$. Note
that $v_0 = s$. Let $\zeta_i$ ($i \in [1, k]$) be the event that (i) the local shortest
path from $s$ to any $v_j$ ($j \in [0, i-1]$) in $B$ is unique, but (ii) the local
shortest path from $s$ to $v_i$ is not unique. We have $\Pr\{\zeta_1\} = 0$;
otherwise, there must exist another node $u$ such that $d_s(u) < d_s(v_1)$,
which contradicts the definition of $v_i$ ($i \in [0, k]$). In addition,

$$
\Pr\{\zeta\} = \Pr\Big\{\bigcup_{1 \le i \le n-1} \zeta_i\Big\}
\le \Pr\{\zeta_1\} + \Pr\Big\{\bigcup_{2 \le i \le n-1} \zeta_i\Big\}
= \Pr\Big\{\bigcup_{2 \le i \le n-1} \zeta_i\Big\}.
$$

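Now let us consider $\Pr\{\zeta_i\}$ for $i \in [2, n-1]$. Let $\mathcal{P}_{s,v_i} =
\{P_1, \ldots, P_q\}$ be the set of local shortest paths from $s$ to $v_i$ in $B$.
For each $P_j$ ($j \in [1, q]$), let $(u_j, v_i)$ be the last edge on $P_j$. Then, $u_j$
should be in $\{v_0, v_1, \ldots, v_{i-1}\}$. By the definition of $\zeta_i$, the local
shortest path from $s$ to $u_j$ in $B$ is unique. Furthermore, $q \le \Delta$,
since the degree of $v_i$ is at most $\Delta$. By Lemma 5, we have

$$
\Pr\{\zeta_i\} \le \sum_{1 \le j < k \le q} \Pr\{p(P_j) = p(P_k)\}
\le \sum_{1 \le j < k \le q} \frac{1}{\tau}
= \frac{1}{\tau}\binom{q}{2}
\le \frac{1}{\tau}\binom{\Delta}{2}.
$$

Therefore,

$$
\Pr\Big\{\bigcup_{2 \le i \le n-1} \zeta_i\Big\}
\le \sum_{2 \le i \le n-1} \Pr\{\zeta_i\}
\le \binom{\Delta}{2} \cdot \frac{n}{\tau},
$$

which completes the proof. □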

Given Lemma 6, we prove Theorem 2 as follows:

Proof of Theorem 2. Let $s$ be an arbitrary node in $G$. For
any $R_i$ ($i \in [0, h]$), there exist at most 16 (4×4)-cell regions in $R_i$
that contain $s$. By Lemma 6, for each $B$ of those 16 regions, there
is at most $\binom{\Delta}{2} \cdot 2n/\tau$ probability that $B$ contains non-unique local
shortest paths between $s$ and another node. Taking into account
all possible choices of $B$ in all $R_i$ and all possible choices of $s$,
Assumption 2 fails with a probability at most

$$
\binom{\Delta}{2} \cdot \frac{32n}{\tau} \cdot h \cdot n = \binom{\Delta}{2} \cdot \frac{32n^2h}{\tau}.
$$

By setting $\tau \ge 32n^3h\binom{\Delta}{2}$, we can guarantee that the above
probability is at most $1/n$. Therefore, Theorem 2 is proved. □
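
To get a feel for the magnitude of $\tau$ that Theorem 2 asks for, the bound can be evaluated directly. The sketch below is only an arithmetic illustration: the node count is that of the US dataset in Table 2, while the values of `h` and `max_degree` are hypothetical placeholders and not figures reported in the paper.

```python
from math import comb

def tau_lower_bound(n, h, max_degree):
    """Smallest tau allowed by Theorem 2: 32 * h * n^3 * C(max_degree, 2)."""
    return 32 * h * n ** 3 * comb(max_degree, 2)

# n taken from Table 2 (US dataset); h and max_degree are assumed values.
tau = tau_lower_bound(n=23_947_347, h=20, max_degree=8)
bits = tau.bit_length()
fits_in_64_bits = bits <= 64
```

Under these assumed values the bound needs more than 64 bits, which is the practical concern about integer width that arises for large networks.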

Remark. The above perturbation method requires generating
random numbers in the integer range of $[0, \tau - 1]$, which causes
practical concerns since $\tau - 1$ can be too large to be represented with
a normal integer. We address this issue by using multiple random
integers in a relatively narrow range to represent $\tau$. In particular, to
generate the nuance for an edge, we can use $k$ random integers in
the range of $[0, \tau' - 1]$, where $\tau' = \tau^{1/k}$. Accordingly, the nuance
on each edge would be a $k$-dimensional vector. It can be verified
that, under such edge perturbation, the results in this section still
hold.

Figure 11: $B_1$ is completely contained in the $((\alpha+6) \times (\alpha+6))$-cell region.

Figure 12: $B_2$ is not completely contained in the $((\alpha+6) \times (\alpha+6))$-cell region.

Algorithm SlidingWindow ($P = (v_1, v_2, \ldots, v_k)$, $R_i$)
 1. Let $\mathcal{B}$ be a set that contains any region $B$ consisting of a consecutive
    block of $x \times y$ cells in $R_i$ ($x, y > 0$).
 2. Initialize $\theta = 0$.
 3. For $j = 1$ to $k$
 4.     $\theta = j$.
 5.     Let $B_\theta$ be the smallest region in $\mathcal{B}$ that covers $v_1, v_2, \ldots, v_j$
        simultaneously.
 6.     If $B_\theta$ is at least 4 cells in width or height, then break.
 7. If $B_\theta$ is at least 4 cells in width
 8.     Let $v_\alpha$ and $v_\beta$ be the nodes in $\{v_1, v_2, \ldots, v_\theta\}$ with the smallest
        and largest x-coordinates, respectively.
 9.     Let $B$ be any (4×4)-cell region in $R_i$ such that (i) $B$ covers $v_1$,
        $v_2, \ldots, v_{\theta-1}$ simultaneously, and (ii) $v_\alpha$ is in the west strip of $B$.
10. Else
11.     Let $v_\alpha$ and $v_\beta$ be the nodes in $\{v_1, v_2, \ldots, v_\theta\}$ with the smallest
        and largest y-coordinates, respectively.
12.     Let $B$ be any (4×4)-cell region in $R_i$ such that (i) $B$ covers $v_1$,
        $v_2, \ldots, v_{\theta-1}$ simultaneously, and (ii) $v_\alpha$ is in the south strip of $B$.
13. Let $a = \min\{\alpha, \beta\}$, and $b = \max\{\alpha, \beta\}$.
14. Return $B$ and $P' = (v_a, v_{a+1}, \ldots, v_b)$.

Figure 13: The SlidingWindow Algorithm

B. THE SLIDINGWINDOW ALGORITHM 

This section presents an algorithm called SlidingWindow, which
will be used in proving the key lemmas in the following sections.
Let $P$ be a shortest path in $G$, such that no (3×3)-cell region in $R_i$
($i \in [1, h]$) can cover all nodes in $P$ simultaneously. Given $P$ and
$R_i$, the SlidingWindow algorithm identifies a (4×4)-cell region $B$
in $R_i$, such that $B$ has a spanning path $P'$ that is a sub-path of $P$.
Figure 13 shows the pseudo-code of SlidingWindow.

Given the grid $R_i$ and a path $P = (v_1, v_2, \ldots, v_k)$, the
algorithm scans the nodes in $P$ one by one (Lines 3-6 in Figure 13).
Each time it scans a node $v_j$ in $P$, it computes a minimal
rectangular region $B_\theta$ (in $R_i$) that contains all nodes $v_1, v_2, \ldots, v_j$ that have
been visited (Line 5). If $B_\theta$ is at least 4 cells in width or height,
then the path $(v_1, v_2, \ldots, v_j)$ must contain a sub-path $P'$ that is
a spanning path of some (4×4)-cell region $B$. To derive such a
region $B$, the algorithm inspects the nodes in $\{v_1, v_2, \ldots, v_j\}$, and
then identifies two nodes $v_\alpha$ and $v_\beta$ as the endpoints of the
sub-path $P'$ (Lines 7-8). Based on $v_\alpha$ and $v_\beta$, the algorithm identifies
$B$ (Line 12), and then returns $B$ and $P'$ as the result (Lines 13-14).

Figure 14: Illustration of the SlidingWindow algorithm.

For example, let us consider the path $P = (v_1, v_2, \ldots, v_6)$
shown in Figure 14. Given $P$ and $R_i$, the SlidingWindow algorithm
examines the nodes in $P$ one by one, and monitors the minimal
rectangular region $B_\theta$ (in $R_i$) that covers all nodes visited.
Figure 14 illustrates the region $B_\theta$ right after $v_5$ is visited. As $B_\theta$ is 5
cells in width, the algorithm stops the examination procedure, and
identifies the nodes with the smallest (resp. largest) x-coordinate in
$\{v_1, v_2, \ldots, v_5\}$, i.e., $v_2$ (resp. $v_5$). After that, the algorithm
derives the (4×4)-cell region $B$ in Figure 14 such that (i) $B$ covers
$v_1, v_2, v_3, v_4$ simultaneously, and (ii) $v_2$ is in the west strip of $B$.
Finally, the algorithm returns $B$ and the path $P' = (v_2, v_3, v_4, v_5)$.
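
The scanning and endpoint-selection steps (Lines 1-8 and 13-14 of the pseudo-code) can be sketched as follows. This is an illustration under stated assumptions: nodes are given as (x, y) coordinates, the grid $R_i$ is represented only by its cell side length `cell`, the coordinates in the example are made up to mimic the situation described for Figure 14, and the placement of the (4×4)-cell region $B$ (Lines 9 and 12) is left out.

```python
import math

def sliding_window(path, cell):
    """Scan path until its bounding region spans at least 4 cells in width
    or height. Returns (theta, a, b, sub_path) with 1-based indices, or
    None if the bounding region never becomes that large."""
    cols, rows = [], []
    for theta, (x, y) in enumerate(path, start=1):
        cols.append(math.floor(x / cell))
        rows.append(math.floor(y / cell))
        width = max(cols) - min(cols) + 1
        height = max(rows) - min(rows) + 1
        if width >= 4 or height >= 4:
            break
    else:
        return None
    axis = 0 if width >= 4 else 1  # compare x-coordinates if wide, else y
    prefix = path[:theta]
    alpha = min(range(theta), key=lambda k: prefix[k][axis]) + 1
    beta = max(range(theta), key=lambda k: prefix[k][axis]) + 1
    a, b = min(alpha, beta), max(alpha, beta)
    return theta, a, b, path[a - 1:b]

# Hypothetical coordinates for v1..v6 on a grid with unit cells.
path = [(1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (2.5, 0.5), (4.5, 1.5), (5.5, 1.5)]
result = sliding_window(path, cell=1.0)
```

With these coordinates the scan stops at the fifth node, where the bounding region becomes 5 cells wide, and the returned sub-path runs from the second node to the fifth, matching the outcome described in the example above.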

C. PROOFS OF LEMMAS 1 AND 2 

Lemma 1. Any $(\alpha \times \alpha)$-cell region in $R_i$ contains $O(\alpha^2\lambda)$
level-$i$ nodes in $H$, where $\lambda$ is the arterial dimension of $G$.

Proof. Consider the $(\alpha \times \alpha)$-cell region $A$ in Figure 11, as well
as the $((\alpha+6) \times (\alpha+6))$-cell region that is centered at $A$. The
level-$i$ nodes that fall in $A$ can be categorized into two overlapping
groups. The first group contains the endpoints of the arterial edges
for a (4×4)-cell region $B_1$ completely covered by the $((\alpha+6) \times (\alpha+6))$-cell
region, and the second group consists of the endpoints of the arterial
edges for a (4×4)-cell region $B_2$ that is disjoint from $A$.

The number of nodes in the first group is at most $2 \cdot (\alpha+3)^2 \cdot \lambda$.
This is because (i) the number of (4×4)-cell regions
contained in the $((\alpha+6) \times (\alpha+6))$-cell area is $(\alpha+3)^2$, and (ii) each
(4×4)-cell region has at most $\lambda$ arterial edges, each of which has two
endpoints. Meanwhile, all nodes in the
second group also appear in the first group. To explain, observe that
for any node $u$ in $A$ and any (4×4)-cell region $B_2$ that is disjoint
from $A$, if $u$ is the endpoint of an arterial edge for $B_2$, then the edge
must (i) connect $u$ to a node $v$ in $B_2$ and (ii) lie on a spanning
path $P$ of $B_2$. It can be verified that there should exist a (4×4)-cell
region $B$ in $A$, such that $B$ contains $u$, and $P$ is a spanning path
of $B$, as exemplified in Figure 12. In that case, the edge between $u$
and $v$ would also be an arterial edge for $B$. In other words, the node
$u$ is also contained in the first group mentioned before. As a
consequence, the total number of level-$i$ nodes in $A$ is at most
$2 \cdot (\alpha+3)^2 \cdot \lambda = O(\alpha^2\lambda)$, which proves the lemma. □
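
The counting step in the proof, namely that an $((\alpha+6) \times (\alpha+6))$-cell area contains $(\alpha+3)^2$ distinct (4×4)-cell regions, can be confirmed by brute force. The helper names below are hypothetical, and the sketch only checks the arithmetic behind the bound $2 \cdot (\alpha+3)^2 \cdot \lambda$.

```python
def count_4x4_regions(side):
    """Count the (4x4)-cell regions that fit in a (side x side)-cell area by
    enumerating the cell at the south-west corner of each region."""
    return sum(
        1
        for row in range(side)
        for col in range(side)
        if row + 4 <= side and col + 4 <= side
    )

def level_i_node_bound(alpha, lam):
    """Upper bound from the proof: two endpoints per arterial edge, at most
    lam arterial edges per region, (alpha + 3)^2 regions."""
    return 2 * count_4x4_regions(alpha + 6) * lam

regions_for_alpha_5 = count_4x4_regions(5 + 6)
bound_for_alpha_5 = level_i_node_bound(alpha=5, lam=3)
```

For any fixed $\lambda$ the bound grows quadratically in $\alpha$, which is the $O(\alpha^2\lambda)$ claimed by the lemma.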

Lemma 2. Let $P$ be a shortest path in $G$, such that no (3×3)-cell
region in $R_i$ ($i \in [1, h]$) can cover all nodes in $P$
simultaneously. Then, $P$ must contain an arterial edge of some (4×4)-cell
region in $R_i$.

Proof. By the SlidingWindow algorithm, we can always find a
(4×4)-cell region (denoted as $B$) in $R_i$, such that a sub-path of $P$
(denoted as $P'$) is a spanning path of $B$. Whenever such a region $B$
exists, $P$ must contain an arterial edge for $B$, and hence, the lemma
is proved. □

D. PROOF OF LEMMA 3 

In this section, we first revisit the preprocessing algorithm of
AH, based on which we present the proof of Lemma 3.

D.1 Preprocessing Algorithm Revisited

Given a road network $G$, AH selects level-$i$ cores (the nodes that
are at least at level $i$) in an incremental manner. At the $i$-th iteration,
AH performs two steps: (i) it computes the spanning paths so as to
select the level-$i$ cores, and (ii) it adds shortcuts concerning the
border nodes in $R_{i+1}$ and the level-$i$ cores to obtain a reduced
graph $G'_i$ for the next iteration.

More specifically, in the first step, each original edge $(u, v) \in G$
is considered an edge generated from a region $B$ if $u$ is in $B$.
Then, in each iteration, AH selects level-$(i+1)$ cores in the
following manner. First, AH imposes the grid $R_{i+1}$ on $G'_i$ (note that
$G'_0 = G$). Then, for each region $B$ in $R_{i+1}$, AH inspects the
sub-graph of $G'_i$ that overlaps with $B$, denoted as $G'_{i,B}$. For each
border node $s$ of $B$, AH invokes Dijkstra's algorithm to start a
traversal on $G'_{i,B}$ from $s$; for each node $u$ visited, AH follows its
outgoing edges $(u, v)$ such that (i) $u$ is a level-$i$ core, and (ii) $(u, v)$
satisfies the coverage condition. This results in a spanning tree
$T_s$. Subsequently, for each node $u$ on $T_s$, AH inspects its
outgoing edges $(u, t)$ in $G'_{i,B}$, such that (i) $(u, t)$ satisfies the coverage
condition, and (ii) $t$ is a border node in $B$, or $t$ is not in $B$ and $t$
is a level-$i$ core. As such, a path from $s$ to $t$ is obtained. Similarly,
AH invokes Dijkstra's algorithm to start a traversal from $s$ again,
but with the difference that for each node visited, AH follows its
incoming edges. We use $\mathcal{P}_{i+1,B}$ to denote the paths thus obtained.
We will prove in Lemma 8, based on Lemma 7 below, that each path
$P \in \mathcal{P}_{i+1,B}$ is a spanning path of $B$.

After the level-$(i+1)$ cores are selected, AH adds shortcuts
concerning the border nodes of $R_{i+2}$ and the level-$(i+1)$ cores to
form $G'_{i+1}$ in this manner: let $B$ be a (4×4)-cell region in $R_{i+1}$,
and $u$ a border node of $R_{i+2}$ or a level-$(i+1)$ core in $B$. Then,
similar to the process of finding spanning paths, AH invokes a
constrained version of Dijkstra's algorithm to start a traversal
on $G'_{i,B}$ from $u$. This results in a spanning tree $T_u$. Subsequently,
AH examines each branch on $T_u$: let $v$ be the first level-$(i+1)$ core
on the branch. Then, AH adds $(u, v)$ as a shortcut. Similarly, AH
also invokes a constrained version of Dijkstra's algorithm to start
a traversal from $u$ again, but with the difference that for each node
visited, AH follows its incoming edges. After that, AH adds
shortcuts from level-$(i+1)$ cores to $u$.

For convenience, we define a few terms that will be frequently
used in our proofs. Let $B$ be a region in $R_i$ and $B'$ be a region in
$R_j$ ($i < j$). If $B$ is completely contained in $B'$, we say that $B$ is
a sub-region of $B'$ and $B'$ is a super-region of $B$. Let $P$ be a path
from a node $s$ to another node $t$. We say $P$ is contained in a region
$B$ if all the nodes on $P$ are in $B$. Under the grid $R_i$, we say two
nodes $s$ and $t$ are far-apart if they are not covered in the same
(3×3)-cell region.
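
The far-apart test reduces to comparing cell indices: two nodes fit in a common (3×3)-cell region exactly when their cell columns differ by at most 2 and their cell rows differ by at most 2. A minimal sketch, assuming nodes are (x, y) coordinates and the grid $R_i$ is described by a hypothetical cell side length `cell`:

```python
import math

def far_apart(s, t, cell):
    """Return True if no (3x3)-cell region of the grid covers both s and t."""
    col_gap = abs(math.floor(s[0] / cell) - math.floor(t[0] / cell))
    row_gap = abs(math.floor(s[1] / cell) - math.floor(t[1] / cell))
    return col_gap >= 3 or row_gap >= 3

same_block = far_apart((0.5, 0.5), (2.5, 2.5), cell=1.0)   # both fit in 3x3
three_columns = far_apart((0.5, 0.5), (3.5, 0.5), cell=1.0)  # 4 columns apart
```

The same pair of nodes can be far-apart under a fine grid and not far-apart under a coarser one, which is why the definition is stated relative to $R_i$.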

D.2 Supporting Lemmas and Proofs 

Our proof of Lemma 3 is based on Lemma 7 and Lemma 8.

Lemma 7. Let $B_s$ be a region in $R_j$, $s$ and $t$ be two nodes
in $B_s$, and $P$ be the local shortest path from $s$ to $t$ in $B_s$. If no
(4×4)-cell region in $R_i$ ($i < j$) can cover all nodes on $P$, then:

1. There is a (4×4)-cell sub-region of $B_s$ in $R_i$ (denoted as $B$),
   that makes a sub-path of $P$ (denoted as $P'$) a spanning path
   of $B$.

2. Either (i) $P'$ is contained in $B$, or (ii) $P'$ is not fully contained
   in $B$, and the last two nodes on $P'$ are not in two adjacent
   cells in $R_i$.

Proof. First, we can obtain $B$ and $P'$ using the SlidingWindow
algorithm in Figure 13 with $P$ and $R_i$ as the input. Second,
we show that using the SlidingWindow algorithm, we can find
$B$ which is a sub-region of $B_s$. Without loss of generality,
suppose that $B_\theta$ is at least 4 cells in width (Line 7 in Figure 13), and
$P' = (v_a, v_{a+1}, \ldots, v_b)$ is a horizontal spanning path of $B$, i.e.,
$v_a$ is in the west strip of $B$ and $v_b$ is in the east strip of $B$ (or to
the east of $B$). Suppose that $v_c$ has the smallest y-coordinate among
the nodes $v_a, v_{a+1}, \ldots, v_b$. Then we can derive a (4×4)-cell
region $B$, where $v_c$ is in the south strip of $B$. It could be verified that
$B$ is a sub-region of $B_s$ because: (i) the side length of a cell in $B$
is smaller than the side length of a cell in $B_s$, (ii) the largest
difference of the y-coordinates among the nodes $v_a, v_{a+1}, \ldots, v_{b-1}$ is
less than 4, and (iii) all the nodes on $P'$ are contained in $B_s$. Hence,
Statement 1 is proved.

Next, we show that the $P'$ and $B$ obtained satisfy the two
conditions stated in Statement 2. Without loss of generality, suppose that
after the iteration (Lines 3-6 in Figure 13) terminates, $B_\theta$ is at least
four cells in width (Line 7 in Figure 13). We consider two cases:
(i) $B_\theta$'s width is exactly 4, and (ii) $B_\theta$'s width is larger than 4. In
case (i), it can be verified that $P'$ is contained in $B$. In case (ii),
the width of $B_\theta$ is less than 4 before $v_\theta$ is visited, and
is larger than 4 after $v_\theta$ is visited. Therefore, $v_{\theta-1}$ and $v_\theta$ cannot
be in two adjacent cells.

Hence, the lemma is proved. □

Lemma 8. Let $B$ be a (4×4)-cell region in $R_i$. Then, the
following statements are true:

1. Each $P \in \mathcal{P}_{i,B}$ is a spanning path of $B$.

2. For any $P \in \mathcal{P}_{i,B}$, a path from $s$ to $t$, either (a) $P$ contains
   only one edge, and $s$, $t$ are level-$i$ cores, or (b) $P$ contains
   more than one edge, in which case a node $w$ on $P$ with $w \ne s, t$ is a
   level-$i$ core, and $w$ is in $B$.

3. Let $B_s$ be a region in $R_j$ ($j > i$), $s$, $t$ be two nodes in $B_s$,
   and $P$ the local shortest path from $s$ to $t$ in $B_s$. If $s$, $t$ are
   far-apart in $R_i$, then there exists a (4×4)-cell region $B$ in
   $R_i$, where $B$ is a sub-region of $B_s$, such that $P$ covers a path
   in $\mathcal{P}_{i,B}$.

4. Let $B_s$ be a region in $R_j$ ($j > i$). Let $s$, $t$ be two nodes in
   $B_s$ and $P$ the shortest path from $s$ to $t$ in $B_s$. If $s$, $t$ are
   far-apart in $R_i$, then $P$ is covered by a level-$i$ core. Further, if
   $P$ contains multiple edges, then there is a level-$i$ core $u$ on $P$
   where $u \ne s, t$.



Proof. This lemma could be proved by mathematical induction.

As for Statement 1, for simplicity, we only consider the paths
from the west strip to the east strip of $B$. The lemma could be
proved true for the paths in the other directions in a similar way.
Within a (4×4)-cell region $B$, the algorithm finds two types of
paths. Suppose that $P$ is a path from $s$ to $t$ found by the algorithm.
Then, either (a) $P$ ends at a border node $t$ of $B$, and $t$ lies in the
east strip of $B$, or (b) $t$ is to the east of $B$, while $t$ and the
predecessor of $t$ on $P$ are both level-$(i-1)$ cores. We prove Statement
1 for each of these two types of paths. For ease of
exposition, we deviate slightly from the algorithm described in
Section D.1: after the level-$i$ cores are selected, we add shortcuts
concerning all the nodes in $G$ and the level-$i$ cores to obtain $G'_i$. At
the end of the proof, we will show that if we only add shortcuts
concerning the border nodes of $R_{i+1}$ and the level-$i$ cores, this
proof also works. Furthermore, based on Statement 1, we prove
that Statements 2, 3, and 4 also hold.

To facilitate our proof, we make the following six claims. Claims
1.1 to 1.3 together prove Statement 1 true. Claims 1.4 to 1.6 are
used for induction.

1.1. Let $u$, $v$ be two level-$(i-1)$ cores in $B$. Then the local
     shortest path from $u$ to $v$ contained in $B$ could be found by
     invoking a constrained version of Dijkstra's algorithm to start
     a traversal on $G'_{i-1}$ within $B$ from $u$: for each node $w$
     visited, the traversal only relaxes the edges $(w, x)$, where $x$ is a
     level-$(i-1)$ core and $(w, x)$ satisfies the coverage condition.

1.2. Let $u$ be a node in $B$ and $v$ a level-$(i-1)$ core in $B$. Then the
     shortest path from $u$ to $v$ contained in $B$ could be found by a
     Dijkstra traversal as described in Claim 1.1.

1.3. Let $u$ be a level-$(i-1)$ core in $B$ and $v$ a node in $B$. Then
     the shortest path from $u$ to $v$ contained in $B$ could be found
     in two steps: (i) perform a Dijkstra traversal as described in
     Claim 1.1, and (ii) after all the level-$(i-1)$ cores in $B$
     reachable from $u$ are visited, a spanning tree from $u$ is created;
     inspect the edges $(w, v)$ where $w$ is on the spanning tree and
     $(w, v)$ satisfies the coverage condition to obtain the shortest
     path from $u$ to $v$.

1.4. Let $u$, $v$ be two level-$i$ cores in $B$. Then the shortest path
     from $u$ to $v$ contained in $B$ could be found by invoking a
     constrained version of Dijkstra's algorithm to start a traversal
     within $B$ from $u$: for each node $w$ visited, the traversal only
     relaxes the edges $(w, x)$ where $x$ is a level-$i$ core and $(w, x)$
     is generated from $B$.

1.5. Let $u$ be a node in $B$ and $v$ be a level-$i$ core in $B$. Then the
     shortest path from $u$ to $v$ contained in $B$ could be found by a
     Dijkstra traversal as described in Claim 1.4.

1.6. Let $u$ be a level-$i$ core in $B$ and $v$ a node in $B$. Then the
     shortest path from $u$ to $v$ contained in $B$ could be found in
     two steps: (i) perform a Dijkstra traversal as described in
     Claim 1.4, and (ii) similar to that of Claim 1.3, inspect the
     edges $(w, v)$ where $w$ is on the spanning tree obtained in (i),
     and $(w, v)$ is generated from $B$, to get the shortest path from
     $u$ to $v$.

Our proof is organized as shown in Figure 15. At the beginning,
we show all statements and claims are true when $i = 1$.
Subsequently, assuming all statements and claims are true when $i = k$,
we prove that they also hold when $i = k + 1$, in the order denoted
by the circled numbers shown in Figure 15.

Figure 15: Proof structure.

It could be verified that Claims 1.1 to 1.3 are true when $i = 1$
given all the nodes in $G$ are level-0 cores, and $G'_0$ is composed
of the edges in $G$. Claim 1.4 is true when $i = 1$. Suppose that
$P$ is a local shortest path from $u$ to $v$, and $t_1, \ldots, t_j$ are in turn
the level-1 cores on $P$. Then the algorithm would add a shortcut
$(u, t_1)$ (or $(u, t_1)$ is an original edge in $G$) whose length equals
the distance in $G'_0$ from $u$ to $t_1$, and $(u, t_1)$ is generated from
$B$. So are the edges $(t_1, t_2), \ldots, (t_j, v)$. Hence, Claim 1.4 holds
when $i = 1$. Claim 1.5 is similar to Claim 1.4, except that the
source node $u$ is not necessarily a level-1 core. Since the algorithm
also adds an edge from $u$ to a level-1 core, similar to the case of
Claim 1.4, Claim 1.5 also holds when $i = 1$. As for Claim 1.6,
suppose $P$ is a local shortest path from $u$ to $v$, and $t_1, \ldots, t_j$ are in
turn the level-1 cores on $P$. Then $(t_j, v)$ is a shortcut (or original
edge) generated from $B$, and the weight of $(t_j, v)$ equals that of the
shortest path from $t_j$ to $v$ contained in $B$. Besides, by Claim 1.4,
$u$ could equally reach $t_j$ by the edges generated from $B$. Hence,
Claim 1.6 holds when $i = 1$. Given Claims 1.1 to 1.3, Statement 1
is true when $i = 1$.

Statement 2 could be proved by the algorithm when i = 1. If 
P contains one edge, both endpoints of P would be selected as 
level-1 cores. If P contains more than one edge, then an internal 
node of P (denoted as w), w ≠ s, t, is selected as a level-1 core. By 
Statement 1, P is a spanning path of B, and w is not an endpoint 
of P. Hence, w is in B. 

As for Statement 3, when i = 1, first we show that there exists 
a (4 × 4)-cell region B in R_1, such that a sub-path of P (including 
P itself) is a spanning path of B. If P is contained in a (4 × 4)-cell 
region B in R_1, since s, t are far-apart in R_1, P is a spanning path 
of B. Otherwise, if P is not contained in any (4 × 4)-cell region in 
R_1, then, by Lemma 7, there exists a region B in R_1 which makes 
a sub-path of P a spanning path of B. Second, note that at the first 
iteration, given B, all its spanning paths could be found because the 
paths are found on the original graph G. Hence, there should exist 
a region B in R_1, such that P covers a path in 𝒫_{1,B}. 

Statement 4 could be proved by Statements 2 and 3. By Statement 
3, there is a (4 × 4)-cell region B in R_i, such that a sub-path P' 
of P covers a path in 𝒫_{i,B}. If P' is exactly P, since P contains 
more than one edge, then by Statement 2, there should be a node u 
on P other than s, t, such that u is a level-i core. If P' is not P, 
then wherever the two endpoints of P' lie on P, by Statement 2, 
there should exist a node u (u ≠ s, t) on P such that u is a 
level-i core. Hence, in either case, Statement 4 is true. 

The above shows that the lemma is true when i = 1. We now 
turn to the induction phase. Supposing the statements and claims 
above are true when i = k, we show that they are true when 
i = k + 1. 

Step (1) in Figure 15. Claim 1.1 is true when i = k + 1 given that 
Claim 1.4 and Statements 2 and 4 are true when i = k. We prove the 
claim in two cases: (a) P is also contained in a sub-region B' of B, 
where B' is a (4 × 4)-cell region in R_k, and (b) P is not contained 
in any (4 × 4)-cell region in R_k. As for case (a), given that Claim 
1.4 is true when i = k, Claim 1.1 is true because a level-(k+1) core 
is also a level-k core. As for case (b), if P contains only one 
original edge (u, v), then (u, v) is an edge that satisfies the 
coverage condition, from which it follows that Claim 1.1 is true. 
Subsequently we consider the case when P contains multiple edges. 
If u, v are far-apart in R_k, then by Statement 4 when i = k, there 
should be a node w (w ≠ u, v) on P such that w is a level-k core. 
If u, v are not far-apart in R_k, given the hypothesis that P is not 
contained in any (4 × 4)-cell region in R_k, there should exist two 
nodes u' and v' such that u' and v' are far-apart in R_k. Again, by 
Statement 4, there also exists a node w on P (w ≠ u, v) such that 
w is a level-k core. Since u, w, v are all level-k cores, we can 
similarly consider the sub-path from u to w (and the sub-path from 
w to v as well) in the two cases above. Because P contains a finite 
number of nodes, P could not be infinitely decomposed, i.e., we 
could always find a sub-path P' of P between two level-k cores, 
such that either (i) P' contains only one edge, or (ii) P' is 
contained in a (4 × 4)-cell region in R_k, and by Claim 1.4 when 
i = k, P' could be discovered by a constrained version of Dijkstra 
traversal which relaxes the edges concerning level-k cores. 
Therefore, Claim 1.1 is true when i = k + 1. 

Step (2) in Figure 15. Claim 1.2 is true when i = k + 1 given that 
Claim 1.1 is true when i = k + 1 and Claim 1.5 is true when i = k. 
The proof of Claim 1.2 is similar to that of Claim 1.1, except that 
the source u is a level-0 core instead of a level-k core. We can 
consider P in two cases: whether it is contained in a (4 × 4)-cell 
region in R_k or not. We focus on the sub-path from u. In this way 
we can always find a level-k core w, so that either the sub-path 
from u to w is contained in a (4 × 4)-cell sub-region B' in R_k, or 
(u, w) is an original edge. In the former case, given that Claim 1.5 
is true when i = k, Claim 1.2 is true when i = k + 1. In the latter 
case, (u, w) is an edge that satisfies the coverage condition; hence, 
Claim 1.2 is also true. 

Step (3) in Figure 15. Claim 1.3 could be proved similarly to 
Claim 1.2, given that Claim 1.6 is true when i = k and Claim 1.1 
is true when i = k + 1. 

Step (4) in Figure 15. Claim 1.4 could be proved true when 
i = k + 1 given that Claim 1.1 is true when i = k + 1. Consider a 
shortest path P between two level-(k+1) cores u, v, contained in a 
(4 × 4)-cell region B in R_{k+1}. By Claim 1.1 when i = k + 1, P 
could equally be found by invoking Dijkstra's algorithm to start a 
traversal which only visits the level-k cores. Along P, let 
t_1, t_2, ..., t_j, v be the level-(k+1) cores (note that a 
level-(k+1) core is also a level-k core). Then AH would add 
(u, t_1) (if it does not exist in G) as a shortcut generated from B, 
and its weight equals the weight of the shortest path from u to t_1 
in B. So are the shortcuts (t_1, t_2), ..., (t_j, v). Hence, Claim 
1.4 is true when i = k + 1. 

Step (5) (resp. (6)) in Figure 15. Claim 1.5 (resp. Claim 1.6) 
could also be proved given that Claim 1.1 is true when i = k + 1 
and Claim 1.2 (resp. Claim 1.3) is true when i = k. 

Step (7) in Figure 15. Statement 1 is true when i = k + 1 given 
that Claims 1.1 to 1.3 and Statement 4 are true when i = k. First, 
type (a) paths could be correctly found. Let s (resp. t) be a border 
node in the west (resp. east) strip of B. Then, for any pair of such 
s and t, the shortest path P from s to t contained in B is a spanning 
path of B. Besides, by Statement 4, P is covered by a level-k core 
since s and t are far-apart in R_{k+1} (from which it follows that 
s and t are far-apart in R_k). Subsequently, given Claims 1.1 to 
1.3, P could be correctly found. Second, type (b) paths could also 
be correctly found. Let P be a local shortest path of B from s to 
t where t is beyond B. Let (u, t) be the last edge on P, and u a 
level-k core. Then, given Claim 1.1 and Claim 1.2, the shortest path 
from s to u contained in B could be correctly found. On the other 
hand, (u, t) satisfies the coverage condition. As a result, P could 
also be correctly found. The above shows that: (i) for a spanning 
path P of type (a) or type (b), P could be found by the algorithm 
supported by Claims 1.1 to 1.3, and (ii) every P ∈ 𝒫_{k+1,B} is a 
spanning path of B. Hence, Statement 1 is true when i = k + 1. 

Step (8) in Figure 15. Statement 2 is true when i = k + 1. The 
proof is similar to the case when i = 1. By the algorithm, each 
P ∈ 𝒫_{k+1,B} satisfies either condition stated in Statement 2. 

Step (9) in Figure 15. Statement 3 is true given that Statement 1 
is true when i = k + 1. If P is not contained in any (4 × 4)-cell 
region in R_{k+1}, by Lemma 7 there should exist a (4 × 4)-cell 
region B in R_{k+1}, where B is a sub-region of B_s, such that a 
sub-path of P (denoted as P') is a spanning path of B. If P is 
contained in a (4 × 4)-cell region in R_{k+1}, we put P' = P. In 
what follows, we show that P' covers a path in 𝒫_{k+1,B}. We 
consider P' in two cases: (a) P' is contained in B, and (b) P' is 
not contained in B. Suppose that P' is from u to v. Without loss of 
generality, suppose that u is in the west strip of B and v is in the 
east strip. In case (a), we show that there should exist two nodes 
u' and v' on P', such that the sub-path of P' from u' to v' is a 
type (a) spanning path in 𝒫_{k+1,B}. Starting with u, we scan the 
nodes on P' one by one and stop the first time a node v' in the east 
strip of B is met. Such a v' exists because P' ends at a node v in 
the east strip (v' might equal v). Similarly, there should exist a 
node u' such that u' is in the west strip and the successors of u' 
on P' are to the east of the west strip (u' might equal u too). The 
sub-path of P' from u' to v' is then a type (a) spanning path that 
could be found by the algorithm; hence, P covers a path in 
𝒫_{k+1,B}. In case (b), we show that P' covers a type (b) spanning 
path in 𝒫_{k+1,B}. First, similar to case (a), there should be a 
node u' on P', such that u' is in the west strip and the successors 
of u' on P' are to the east of the west strip. On the other hand, 
let v' be the predecessor of v on P'. By Lemma 7, since P' is not 
contained in B, v' and v are not in two adjacent cells in R_{k+1}, 
from which it follows that v' and v are far-apart in R_k. Hence, the 
sub-path of P' from u' to v is a type (b) spanning path in 
𝒫_{k+1,B}. 

Step (10) in Figure 15. Statement 4 could be proved when 
i = k + 1 given that Statements 2 and 3 are true when i = k + 1. 
The proof is similar to the case when i = 1. 

Finally, we show that the statements and claims also hold if, at 
each iteration, AH only adds shortcuts concerning the border nodes 
of R_{i+1} and the level-i cores. Let B be a (4 × 4)-cell region in 
R_{i+1} (i ≥ 1). In the computation of spanning paths, AH invokes 
Dijkstra's algorithm to start a traversal from a border node of 
R_{i+1}. During the traversal, AH only visits the level-i cores. 
After the traversal is completed, AH only visits the edges from a 
level-i core to a border node to obtain a spanning path. Hence, AH 
only uses the edges concerning the border nodes of R_{i+1} and the 
level-i cores. As such, it suffices to only add shortcuts concerning 
the border nodes and level-i cores. □ 

Lemma 3. For any two nodes u, v ∈ G, if no (3 × 3)-cell region 
in R_i (i ∈ [1, h]) can cover u and v simultaneously, then the 
shortest path from u to v must go through a node at level i or above. 

Proof. This lemma follows from Statement 4 of Lemma 8 
when B_s is the (4 × 4)-cell region in R_h, where B_s covers the 
entire road network G. □ 
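
The precondition of Lemma 3 (no (3 × 3)-cell region covers both u and 
v) can be checked from cell indices alone. The following is a minimal 
sketch under assumptions that are not stated in this appendix: the grid 
is axis-aligned with a known origin and cell side length, regions are 
aligned to cell boundaries, and the grid is at least three cells wide in 
each direction. The names cell_index and coverable are hypothetical. 

```python
import math


def cell_index(point, origin, cell_side):
    """Return the (column, row) index of the grid cell containing `point`."""
    x, y = point
    ox, oy = origin
    return (math.floor((x - ox) / cell_side), math.floor((y - oy) / cell_side))


def coverable(u, v, origin, cell_side, k=3):
    """True if some cell-aligned (k x k)-cell region contains both points.

    Two cells fit in a common block of k consecutive columns and rows
    exactly when their indices differ by at most k - 1 in each dimension.
    """
    cu = cell_index(u, origin, cell_side)
    cv = cell_index(v, origin, cell_side)
    return abs(cu[0] - cv[0]) <= k - 1 and abs(cu[1] - cv[1]) <= k - 1
```

Under these assumptions, when coverable(u, v, ...) is False for the grid 
R_i, Lemma 3 guarantees a node at level i or above on the shortest path 
from u to v. Doubling cell_side models the move from R_i to R_{i+1}. 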

E. PROOF OF LEMMA 4 

Our proof of Lemma 4 is based on the following lemma. 

Lemma 9. Let B be any (4 × 4)-cell region in R_i (i ∈ [1, h]), 
E_B be the set of pseudo-arterial edges for B, and V_B be a set 
containing the endpoints of the edges in E_B. Then, the number of 
nodes in V_B is at most 50λ^2. 

In what follows, we will first prove Lemma 4 based on Lemma 9, 
and then establish the validity of Lemma 9. 

Lemma 4. Any (α × α)-cell region in R_i contains O(α^2 λ^2) 
nodes whose level in H* is no lower than i, where λ is the 
arterial dimension of G. 

Proof. Consider the (α × α)-cell region A in Figure 11, as well 
as the ((α+6) × (α+6))-cell region that is centered at A. Contained 
in A, the nodes whose level in H* is no lower than i (i.e., the 
level-i cores) can be categorized into two overlapping groups. The 
first group contains the level-i cores selected due to a spanning 
path of a (4 × 4)-cell region B_1 completely covered by the 
((α+6) × (α+6))-cell region, and the second group consists of the 
level-i cores selected from a (4 × 4)-cell region B_2 that is 
disjoint from A. 

The number of nodes in the first group is at most 50 · (α+3)^2 · λ^2. 
This is because (i) the number of (4 × 4)-cell regions contained in 
the ((α+6) × (α+6))-cell area is (α+3)^2, and (ii) each (4 × 4)-cell 
region generates at most 50λ^2 level-i cores by Lemma 9 (note that 
the endpoints of the pseudo-arterial edges in R_i are level-(i-1) 
cores). Meanwhile, all nodes in the second group also appear in the 
first group. To explain, note that for any level-i core u in A, and 
any (4 × 4)-cell region B_2 that is disjoint from A, if u is 
selected as a level-i core due to a spanning path P of B_2, then, by 
Statement 2 of Lemma 8, P contains only one edge since u is not 
in B_2. It could be verified that there should exist a (4 × 4)-cell 
region B in A, such that B contains u, and P is also a spanning 
path of B, as exemplified in Figure 12. In other words, the node 
u is also contained in the first group mentioned before, because u 
is selected due to P, a spanning path of B. As a consequence, the 
total number of level-i cores in A is at most 50 · (α+3)^2 · λ^2, 
which is O(α^2 λ^2). □ 
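
The counting step in the proof above, namely that an 
((α+6) × (α+6))-cell area contains (α+3)^2 cell-aligned (4 × 4)-cell 
regions, can be checked by brute-force enumeration. The sketch below is 
only a sanity check of that arithmetic; the function names are 
hypothetical and the constant 50 is taken from Lemma 9. 

```python
def count_subregions(n, k=4):
    """Count cell-aligned (k x k)-cell regions inside an (n x n)-cell area
    by enumerating every possible top-left corner cell."""
    return sum(1 for _row in range(n - k + 1) for _col in range(n - k + 1))


def first_group_bound(alpha, lam):
    """Upper bound on the first group of level-i cores in the proof of
    Lemma 4: (number of 4x4 regions) times 50 * lam^2 cores per region."""
    return 50 * count_subregions(alpha + 6) * lam ** 2


# The enumeration agrees with the closed form (alpha + 3)^2.
for alpha in range(1, 11):
    assert count_subregions(alpha + 6) == (alpha + 3) ** 2
```

For example, with α = 2 and λ = 3 the bound evaluates to 
50 · 25 · 9 = 11250, which grows as α^2 λ^2 as the lemma states. 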

It remains to prove Lemma 9. The key idea of our proof is to 
show that, for any (4 × 4)-cell region B in R_i, the spanning paths 
of B contain O(λ^2) level-(i-1) cores, which results in O(λ^2) 
level-i cores selected for any i ∈ [1, h]. To facilitate our proof, 
in the following, we will first establish some properties of the 
shortcuts in H* (in Lemma 10). Next, based on Lemma 10, we 
demonstrate the characteristics of certain spanning paths of B (in 
Lemma 11). Subsequently, we will employ Lemma 11 to show a general 
property of every spanning path in a (4 × 4)-cell region B in R_i 
(in Lemma 12). Finally, we will prove Lemma 9 based on Lemma 12. 

Lemma 10. Let B be a (4 × 4)-cell region in R_i, and (u, v) a 
shortcut created in B. Then the path contracted by (u, v) is 
contained in B. 

Proof. This lemma could be proved by mathematical induction. 
Let P be the path contracted by (u, v). First, we show that 
the lemma holds when i = 1. Assume to the contrary that the path 
contracted by (u, v) is not contained in B. Then, there is a node 
x on P such that x is beyond B. Let (x, y) be the edge on P. 
Such a y exists because x ≠ v given that (u, v) is a shortcut 
created in B, from which it follows that v should be in B. In that 
case, (x, y) is not an edge generated from B because x is not in B. 
Second, suppose that the lemma holds when i = k; we show that it 
also holds when i = k + 1. By contradiction, let x be the node on P 
that is not in B. If x is not a level-k core, x is not visited 
during the creation of (u, v). It follows that x is contracted by a 
shortcut e where e is generated from a sub-region of B, and given 
that the lemma is true when i = k, x should be in B. If x is a 
level-k core, let (x, y) be the edge visited during the creation of 
(u, v). Then (x, y) violates the coverage condition since x is 
beyond B. Hence, the lemma is proved. □ 

Given Lemma 10, we have the following lemma: 

Lemma 11. Let B be a (4 × 4)-cell region in R_i, P ∈ 𝒫_{i,B} be 
a path from s to t, and (u, t) be the last edge on P. If t is beyond 
B, then u, t are level-(i-1) cores. 

Proof. Clearly the lemma holds when i = 1, given that all 
the nodes in G are level-0 cores. Suppose that the lemma holds 
when i = k; in the following we show that the lemma is true when 
i = k + 1. First, t is a level-k core, since except for the source 
node s, the Dijkstra traversal only visits the level-k cores to find 
the spanning paths. Second, we show that (u, t) is visited by the 
Dijkstra traversal. Suppose that (x, t) is the edge on P relaxed by 
the Dijkstra traversal. Then, (x, t) should be an original edge 
because, by Lemma 10, if (x, t) is a shortcut, t should be in B, 
which violates the hypothesis. Hence, x = u. It follows that u, t 
are both level-(i-1) cores because the Dijkstra traversal would only 
visit the level-(i-1) cores. Therefore, the lemma holds. □ 

Given Lemma 11, the following lemma is proved. 

Lemma 12. Let B be a (4 × 4)-cell region in R_i (i ≥ 2), and 
P ∈ 𝒫_{i,B} a path from a node s to another node t. Let (a, b) be 
an arterial edge on P. Then, there exists a (4 × 4)-cell sub-region 
of B in R_{i-1} (denoted as B'), such that the sub-path of P from a 
to t covers a path in 𝒫_{i-1,B'}. 

Proof. Suppose that P_{a,t} is the sub-path of P from a to t. 
Without loss of generality, suppose that s is in the west strip of 
B, and (a, b) goes across the vertical bisector of B (denoted as 
l_b) where a (resp. b) lies to the west (resp. east) of l_b. 
According to the definition of spanning path, in that case, t should 
be to the east of l_b, and t is not in the cell adjacent to l_b. It 
could be verified that a and t are far-apart in R_{i-1} since the 
side length of a cell in R_i is twice that in R_{i-1}. If t is 
contained in B, then, by Statement 3 of Lemma 8, this lemma holds. 
In the following we consider the case where t is beyond B. 

Let (u, t) be the last edge on P_{a,t}. We denote the sub-path of 
P_{a,t} from a to u as P_{a,u}. If on P_{a,u} there exist two nodes 
x, y such that x and y are far-apart in R_{i-1}, then, by Statement 
3 of Lemma 8, this lemma holds. Next, we consider the case when 
there is a (3 × 3)-cell sub-region of B in R_{i-1} such that the 
sub-region contains all the nodes on P_{a,u}. Let a_1 be the node on 
P_{a,u} which has the smallest x-coordinate. It could be verified 
that there is a (4 × 4)-cell sub-region of B in R_{i-1}, denoted as 
B', such that: (i) B' contains all the nodes on P_{a,u}, and (ii) 
a_1 is in the west strip of B'. Let v be the node on P_{a,u} such 
that (i) v is in the west strip of B', and (ii) v is a border node 
of B'. Figure 16 illustrates an example of the positions of a, b, 
a_1, v, u, t, B and B' as described above. Then, the path P_{v,t} 
from v to t is in 𝒫_{i-1,B'} because: (i) t is to the east of l_b, 
but is not in the cell adjacent to l_b, which indicates that t is to 
the east of the vertical bisector of B' (denoted as l'_b), and t is 
not in the cell adjacent to l'_b, and (ii) by Lemma 11, both u and t 
should be level-(i-1) cores because (u, t) is the last edge on P, 
which indicates that u, t are level-(i-2) cores. Hence, P_{v,t} is 
in 𝒫_{i-1,B'}, and therefore the lemma holds. □ 

Given Lemma 12, we prove Lemma 9 as follows: 

Proof of Lemma 9. When i = 1, the edges in E_B are arterial 
edges of B. Then, there are at most 2λ nodes in V_B. In the 
following, we show that the lemma also holds when i ≥ 2. 

Figure 16: A sub-path from a to t. 

Each pseudo-arterial edge e ∈ E_B is either an original edge in 
G or a shortcut. We consider two disjoint subsets E_1 and E_2 of 
E_B, where E_1 contains all the original arterial edges, and E_2 the 
shortcuts. Suppose that |E_1| = λ_1 and |E_2| = λ_2. Then, there are 
at most 2λ_1 nodes in V_B that are endpoints of the edges in E_1. In 
what follows, we consider E_2. 

We divide E_2 into several disjoint subsets according to the 
arterial edges that each e ∈ E_2 contracts, i.e., the shortcuts that 
contract the same arterial edge are in the same subset. For each 
subset E_sub, we show that there are at most 50λ nodes in V_B that 
are endpoints of the edges in E_sub. Suppose that the shortcuts 
in E_sub are (X_1, Y_1), ..., (X_k, Y_k) (note that by the AH 
algorithm, X_1, ..., X_k, Y_1, ..., Y_k are all level-(i-1) cores), 
and they are respectively on the paths P_1, ..., P_k to the nodes 
t_1, ..., t_k. We use (a, b) to denote the arterial edge that those 
shortcuts (X_1, Y_1), ..., (X_k, Y_k) contract. Besides, we use P'_j 
to denote the sub-path of P_j from a to t_j for all 1 ≤ j ≤ k. Let 
G_a be a graph which is composed of the edges e where e is on P'_j. 
Then, G_a is a tree; otherwise, it violates the hypothesis that the 
local shortest paths in B are unique. Besides, Y_1, ..., Y_k are in 
the tree G_a. We show that there are at most 25λ distinct nodes 
among Y_1, ..., Y_k. By Lemma 12, for each P'_j (1 ≤ j ≤ k), there 
exists a (4 × 4)-cell region in R_{i-1} (denoted as B'), where B' is 
a sub-region of B, such that P'_j covers a path in 𝒫_{i-1,B'}. By 
Statement 2 of Lemma 8, P'_j is covered by a level-(i-1) core. Since 
B' contains at most λ arterial edges, and there are at most 
twenty-five sub-regions of B, there are at most 25λ level-(i-1) 
cores that cover the paths P'_j for 1 ≤ j ≤ k. On the other hand, on 
P'_j, among the level-(i-1) cores, Y_j is the closest one to a; 
otherwise, if a level-(i-1) core u is on the path from a to Y_j, 
then (X_j, Y_j) contracts a level-(i-1) core, which violates the 
algorithm. Hence, there are at most 25λ distinct nodes among 
Y_1, ..., Y_k. Symmetrically, the number of distinct nodes among 
X_1, ..., X_k is at most 25λ as well. Hence, in V_B, there are at 
most 50λ nodes that are endpoints of the edges in E_sub. In 
addition, given |E_2| = λ_2, there are at most λ_2 disjoint subsets, 
from which it follows that, in V_B, there are at most 50λ · λ_2 
nodes that are endpoints of the edges in E_2. 

On the other hand, if e ∈ E_B is an original edge, e cannot be 
contracted by other shortcuts in E_B at the same time because both 
of the two endpoints of e are level-(i-1) cores. Hence, in total 
there are at most 50λ^2 nodes in V_B. □ 



F. PROOF OF THEOREM 1 

We prove Theorem 1 by presenting a series of lemmas and theorems 
that establish the space and time complexities of AH, as well 
as the correctness of AH's query processing algorithms. 

Lemma 13. Each node in H* has O(λ^2) non-elevating edges. 

Proof. Let u be a node at level i, and (u, v) a non-elevating 
edge. Under the grid R_{i+1}, consider the (5 × 5)-cell region 
centered at u (denoted as C). Then, v is in C, because the algorithm 
first builds an SPT (see Definition 3) from u in C, and then adds 
non-elevating edges from u to the level-i cores on the SPT. Besides, 
by Lemma 4, there are O(λ^2) level-i cores in C, which is a 
(10 × 10)-cell region in R_i. Hence, the number of non-elevating 
edges is O(λ^2). □ 

Lemma 14. Each node in H* has O(hλ^2) elevating edges. 

Proof. Let (u, v) be an elevating edge. Then v has a higher 
rank than that of u. Suppose v is at level i. It suffices to show 
that for a fixed i, the number of such elevating edges is O(λ^2). 
This is because (u, v) is obtained from an SPT rooted at u generated 
in a (5 × 5)-cell region centered at u in R_i, and by Lemma 4 the 
number of such nodes v is O(λ^2). In the worst case, u is at level 
0, and AH adds elevating edges from u to the level-i nodes where 
i ∈ [1, h]. Hence, u has O(hλ^2) elevating edges. □ 

Theorem 3. The space overhead of AH is O(hn). 

Proof. By Lemmas 13 and 14, each node has O(hλ^2) edges, 
and there are n nodes in G. Hence, the space overhead is O(hn) 
when λ is a constant. □ 

Theorem 4. AH answers a distance query in O(h log h) time. 
Besides, it answers a shortest path query in O(h log h + k) time, 
where k is the number of edges in the shortest path. 

Proof. AH answers any distance query with two traversals of 
H*, starting from the source s and destination t of the query, 
respectively. Due to the proximity constraint, in the i-th level of 
H* (i ∈ [0, h]), each traversal of AH only visits the nodes in a 
(5 × 5)-cell region in R_{i+1}. By Lemma 4, such a region only 
contains O(λ^2) level-i cores because a (5 × 5)-cell region in 
R_{i+1} is a (10 × 10)-cell region in R_i. Hence, the total number 
of nodes traversed by AH is O(hλ^2). Furthermore, for each node v 
visited during a traversal, AH either follows the elevating edges of 
v to a certain level of H*, or moves along the non-elevating edges 
of v that satisfy the rank and proximity constraints. As previously 
discussed in Lemma 14, v has O(λ^2) elevating edges to each level 
of H*, and has O(λ^2) non-elevating edges. Therefore, the total 
number of edges visited by AH is O(hλ^4). Since each traversal is 
performed using Dijkstra's algorithm, its overall time complexity 
is O(hλ^2 log(hλ^2) + hλ^4). Consequently, when λ is a constant, 
the time complexity of AH for a distance query is O(h log h). 

To answer a shortest path query from s to t, AH first processes its 
corresponding distance query to retrieve the shortest path P' from 
s to t in H*, and then transforms P' into the actual shortest path 
P from s to t in G. For each shortcut e, it requires O(1) time to 
decompose e into two edges, and an original edge cannot be further 
decomposed. Hence, the transformation from P' to P takes O(k) 
time, where k is the number of edges in P. Therefore, AH requires 
O(h log h + k) time to answer a shortest path query. □ 
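
The O(k) transformation from P' to P can be illustrated with a small 
unpacking routine. This is a sketch under assumptions, not AH's actual 
data layout: each shortcut (u, v) is assumed to be stored with a single 
marked node w so that it decomposes into (u, w) and (w, v), and the 
dictionary middle and the function unpack_path are hypothetical names. 

```python
def unpack_path(path, middle):
    """Expand a path in the hierarchy into the original path in G.

    path   : list of nodes, consecutive nodes joined by an edge or shortcut
    middle : maps a shortcut (u, v) to its marked node w; original edges
             have no entry
    """
    if not path:
        return []
    result = [path[0]]
    # Edges still to expand, reversed so that they pop in path order.
    stack = [(path[i], path[i + 1]) for i in range(len(path) - 1)][::-1]
    while stack:
        u, v = stack.pop()
        w = middle.get((u, v))
        if w is None:
            result.append(v)       # original edge: emit its endpoint
        else:
            stack.append((w, v))   # second half, expanded later
            stack.append((u, w))   # first half, expanded next
    return result


# Hypothetical example: shortcut (s, t) contracts the path s, a, b, c, t.
middle = {("s", "t"): "b", ("s", "b"): "a", ("b", "t"): "c"}
original = unpack_path(["s", "t"], middle)
```

Each decomposition replaces one edge by two, so a path with k original 
edges needs fewer than k decompositions, which matches the O(k) bound 
in the proof. 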

Lemma 15. AH requires O(hn^2) time to construct H*. 

Proof. The preprocessing algorithm of AH consists of three 
steps: (i) assigning nodes to each level of H*, (ii) deriving the 
strict total order on nodes at the same level, and (iii) creating 
shortcuts in H*. As for (i), when assigning nodes to the i-th level 
of H* (i ∈ [0, h - 1]), AH inspects each non-empty (4 × 4)-cell 
region in R_{i+1}, and constructs a subgraph that consists of the 
level-i cores and border nodes in the region. For each border node 
u, AH needs to apply Dijkstra's algorithm to traverse the subgraph. 
By Lemma 4, O(λ^2) level-i cores are visited during the traversal, 
and each node visited has O(λ^2) edges. Hence, building a Dijkstra 
tree from u requires O(λ^4) time. After the Dijkstra tree is 
constructed, AH needs to inspect each node in the tree and the 
border nodes to find a spanning path, which requires O(nλ^2) time, 
because there are O(λ^2) nodes in the tree and, as a loose estimate, 
n border nodes. On the other hand, u is contained in a constant 
number of (4 × 4)-cell regions in R_i; hence, it requires O(nλ^2) 
time to find the spanning paths from u. As such, it requires 
O(n^2 λ^2) time to find all the spanning paths. As for (ii), AH 
takes only O(n) time to derive the strict total order at level i of 
H*, since the derivation is based on a linear-time algorithm for 
vertex cover. As for (iii), to construct shortcuts at the i-th level 
of H*, AH needs to inspect a graph G* reduced from G. For each node 
u in G*, AH invokes Dijkstra's algorithm to start a traversal in the 
(5 × 5)-cell region in R_{i+1} that is centered at u. After that, it 
creates shortcuts from u by traversing the nodes in the Dijkstra 
tree obtained. By Lemma 4, there are O(λ^2) nodes in the tree. 
Hence, it costs O(λ^2) time to generate level-i shortcuts from u. As 
such, the time required to create shortcuts at level i of H* is 
O(nλ^2). There are h levels, and therefore, AH costs O(hnλ^2) time 
to create shortcuts. Since λ is a constant, AH requires O(hn^2) time 
in total to construct H*. □ 

To prove Theorems 5 and 6, we need the following lemma: 

Lemma 16. For any two nodes s and t, there exists a path P 
from s to t such that (i) the weight of P equals the weight of the 
shortest path from s to t in G, and (ii) the rank sequence of P is 
unimodal. 

Proof. Among the nodes on the shortest path from s to t in G, 
let u be the one with the highest rank. Let P_{s,u} be the shortest 
path from s to u in G. It suffices to show that there is a path P_f 
from s to u, such that (i) the weight of P_f equals the weight of 
P_{s,u}, and (ii) the rank sequence of P_f is increasing. We use 
rank(u) to denote the rank of a node u. 

Among the nodes on P_{s,u}, let v be a node such that: (i) 
rank(s) < rank(v), and (ii) if v' is another node on P_{s,u} that 
has a higher rank than that of s, then v is closer to s than v'. Let 
P_{s,v} be the sub-path of P_{s,u} from s to v. Then P_{s,v} is also 
a shortest path, and among the nodes on P_{s,v}, v has the highest 
rank and s comes second; otherwise, it violates the second property 
of v. If P_{s,v} contains multiple edges, by the algorithm (s, v) is 
a level-rank(s) edge, and the weight of (s, v) equals the weight of 
P_{s,v}. It means that (s, v) is an edge on P_f. Then, in a similar 
way, we continue to consider the path from v to u. Therefore, the 
lemma is proved. □ 
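
For concreteness, the unimodal property in Lemma 16 can be tested on a 
sequence of ranks as follows. This is an illustrative sketch only; the 
function name is_unimodal is hypothetical, and it assumes ranks are 
distinct because they come from a strict total order. 

```python
def is_unimodal(ranks):
    """True if `ranks` strictly increases up to a single peak and then
    strictly decreases. Purely increasing or purely decreasing sequences
    also count, since either phase may be empty."""
    n = len(ranks)
    i = 0
    while i + 1 < n and ranks[i] < ranks[i + 1]:
        i += 1  # climb to the peak
    while i + 1 < n and ranks[i] > ranks[i + 1]:
        i += 1  # descend from the peak
    return n == 0 or i == n - 1
```

A rank sequence such as 1, 3, 5, 4, 2 passes the check, whereas 
1, 3, 2, 4 does not, because the ranks rise again after descending. 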

Theorem 5. AH correctly answers any distance query. 

Proof. Suppose P_{s,t} is a shortest path from s to t in G. 
Lemma 16 shows that there exists a path P from s to t with a 
unimodal rank sequence where the weight of P equals the weight of 
P_{s,t}, i.e., the rank constraint would not affect the query 
correctness. Now we show that the proximity constraint would not 
affect the correctness either. 

Let u be the node with the highest rank among all the nodes on 
P_{s,t}. Let P_f be the shortest path from s to u whose rank 
sequence is increasing. It suffices to show that the proximity 
constraint in the forward search would not affect the discovery of 
P_f. By contradiction, suppose that v is a level-i node on P_f, but 
v is beyond the (5 × 5)-cell region in R_{i+1} centered at the cell 
that contains s. Let P_{s,v} be the shortest path from s to v. Then, 
in this case, P_{s,v} contains multiple edges; otherwise, since s 
and v are far-apart in R_{i+1}, by Statement 2 of Lemma 8, s and v 
are both level-(i+1) cores, which violates the assumption that v is 
at level i. Besides, by Statement 4 of Lemma 8, there is a 
level-(i+1) core x on P_{s,v} with x ≠ v. Then, it violates the 
rank-increasing property of P_f because x comes before v on P_f. 

Finally, we show that the elevating edges would not affect the 
correctness. Let R_j (j ∈ [1, h]) be the coarsest grid where no 
(4 × 4)-cell region contains both s and t. By Lemma 3, the shortest 
path from s to t should pass through at least one node whose 
level is at least j. This indicates that AH's traversal from s would 
meet its traversal from t at level j or above of H*. As such, 
omitting the nodes that are lower than level j would not affect the 
correctness. Therefore, the theorem is proved. □ 
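
To illustrate why a unimodal path suffices, the sketch below runs two 
rank-increasing Dijkstra searches and joins them at the best meeting 
node. It models only the rank constraint; the proximity constraint and 
elevating edges discussed in the proof are omitted, so it should not be 
read as AH's query algorithm. The function names, the adjacency-list 
layout, and the toy graph are all assumptions made for this example. 

```python
import heapq


def upward_search(adj, rank, source):
    """Dijkstra from `source` that only relaxes edges leading to a node
    of strictly higher rank (the rank constraint)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if rank[v] <= rank[u]:
                continue  # would break the increasing rank sequence
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def distance_query(adj, radj, rank, s, t):
    """Combine a forward search from s (on adj) and a backward search
    from t (on the reversed graph radj) at their best common node."""
    fwd = upward_search(adj, rank, s)
    bwd = upward_search(radj, rank, t)
    common = fwd.keys() & bwd.keys()
    return min((fwd[x] + bwd[x] for x in common), default=float("inf"))


# Toy undirected graph s - a - m - t with unit weights. The shortcut
# (s, m) of weight 2 contracts the low-ranked node a.
rank = {"a": 0, "s": 1, "t": 2, "m": 3}
graph = {
    "s": [("a", 1), ("m", 2)],
    "a": [("s", 1), ("m", 1)],
    "m": [("a", 1), ("s", 2), ("t", 1)],
    "t": [("m", 1)],
}
answer = distance_query(graph, graph, rank, "s", "t")
```

In this toy graph the two searches meet at m, the highest-ranked node 
on the path. Without the shortcut (s, m), the forward search could not 
leave s, which is why the shortcuts of Lemma 16 are needed. 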

Theorem 6. AH correctly answers any shortest path query. 

Proof. By Theorem 5, given two nodes s and t, the query 
algorithm can discover a path P with a unimodal node rank sequence. 
It remains to show that every shortcut e on P can be used to 
reconstruct the path that e contracts. There are two types of 
shortcuts: elevating edges and non-elevating edges (i.e., the 
level-i edges). 

As for the non-elevating edges, it suffices to show that the 
non-elevating edges on the rank-increasing path discovered by the 
forward search from s can be reconstructed. Suppose that (u, v) is a 
non-elevating edge on the path discovered by the forward search. 
Then, (u, v) contracts the shortest path P_{u,v} from u to v. In 
addition, the rank of v is higher than that of u. Suppose that 
(u, v) is marked with a node w. Then the sub-path P_{w,v} (resp. 
P_{u,w}) of P_{u,v} from w to v (resp. from u to w) is also a 
shortest path. It suffices to show that (w, v) (resp. (u, w)) is an 
original edge or a non-elevating edge. If P_{w,v} contains multiple 
edges, then by the meaning of w, among the nodes on P_{w,v}, v has 
the highest rank and w comes second; hence, AH would add a level-i 
edge (w, v) where w is at level i. On the other hand, we can reach a 
similar conclusion for (u, w) if we build the non-elevating edge to 
w from a backward SPT rooted at w. 

As for the elevating edges, we use mathematical induction to 
prove that every elevating edge e on the shortest path P could be 
reconstructed. First, let (u, v) be an elevating edge where (i) u is 
a node at level 0, (ii) v is at level 1, and (iii) (u, v) is marked 
with a node w. Then w is the first node that has a higher rank than 
that of u on the shortest path from u to v. In that case, (u, w) is 
a level-0 edge (a non-elevating edge), and as discussed before, 
(u, w) could be reconstructed. If w ≠ v, we continue to consider 
(w, v). Since w is also at level 0, and v is the first node at level 
1 on the shortest path from w to v, (w, v) is also an elevating 
edge. Similar to the discussion above, (w, v) could also be 
reconstructed. As such, all the elevating edges to a node at level 1 
on the shortest path could be reconstructed. Suppose that the 
elevating edges e to a level-i node on a shortest path P could be 
reconstructed. We show that the elevating edges e to a level-(i+1) 
node could also be reconstructed. Let (u, v) be an elevating edge 
where v is at level (i+1). There are two types of elevating edges: 
(a) u is a border node of R_{i+1}, and (b) u is a node at level i. 
We first consider (a). Let T be an SPT from u. Let w be the node 
that immediately follows u on the branch from u to v in T. Then w is 
a level-i core. In that case, (u, w) is also an elevating edge 
because: (i) u being a border node of R_{i+1} implies that u is also 
a border node of R_i, and (ii) w is the node at level i that is 
closest to u on the path from u to v. By induction, (u, w) could be 
reconstructed. If w ≠ v, we continue to consider (w, v). Since the 
shortest path from w to v does not go through another node at level 
i+1, (w, v) is an elevating edge of type (b). It remains to show 
that the type (b) elevating edges could be reconstructed. Let (u, v) 
be a type (b) elevating edge, i.e., u is at level i - 1 and v at 
level i. By the algorithm, (u, v) is marked with a node w, the first 
node that has a higher rank than that of u on the shortest path from 
u to v. Then, (u, w) is a level-i edge and could be reconstructed. 
If w ≠ v, we continue to consider w, and similarly, (w, v) is also 
an elevating edge of type (b) marked with w'. As such, there is an 
edge (w', v) such that w' is at level i, and the nodes on the 
shortest path from w' to v (excluding w' and v) have lower ranks 
than that of w'. It follows that (w', v) is also a level-i edge, and 
could be reconstructed. Similarly, all the elevating edges from a 
level-(i+1) node to the border nodes of R_{i+1} and the level-i 
nodes could also be reconstructed. 
Therefore, the theorem is proved. □ 



